Overview

Dataset statistics

Number of variables: 45
Number of observations: 683788
Missing cells: 253521
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 GiB
Average record size in memory: 1.8 KiB
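
The overview figures are simple aggregates over the table. As a rough illustration (not pandas-profiling's own implementation), the sketch below derives the same kinds of numbers from a small in-memory table. The `overview` helper and the toy rows are hypothetical and exist only for this example.

```python
def overview(rows):
    """Summarise a table given as a list of dicts; None marks a missing cell."""
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    n_cells = n_obs * n_vars
    missing = sum(value is None for row in rows for value in row.values())

    # A row counts as a duplicate when an identical row appeared earlier.
    seen, duplicates = set(), 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)

    return {
        "variables": n_vars,
        "observations": n_obs,
        "missing_cells": missing,
        "missing_pct": 100 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
    }


# Hypothetical toy data, loosely modelled on three of the report's columns.
rows = [
    {"tree_id": 3, "health": "Good", "bin": 1001},
    {"tree_id": 4, "health": None, "bin": None},
    {"tree_id": 7, "health": "Fair", "bin": 1002},
    {"tree_id": 8, "health": "Good", "bin": 1003},
]
summary = overview(rows)
print(summary)

# The report's own missing-cell percentage follows the same formula:
# missing cells / (variables * observations).
reported_pct = 100 * 253521 / (45 * 683788)
print(round(reported_pct, 1))
```

The last two lines apply the formula to the figures reported above, which is a quick way to sanity-check an extracted report.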

Variable types

NUM: 19
CAT: 17
BOOL: 9

Reproduction

Analysis started: 2020-06-07 13:19:22.651867
Analysis finished: 2020-06-07 13:25:15.319605
Duration: 5 minutes and 52.67 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

state has constant value "New York" [Constant]
created_at has a high cardinality: 483 distinct values [High cardinality]
spc_latin has a high cardinality: 132 distinct values [High cardinality]
spc_common has a high cardinality: 132 distinct values [High cardinality]
problems has a high cardinality: 232 distinct values [High cardinality]
address has a high cardinality: 408701 distinct values [High cardinality]
nta has a high cardinality: 188 distinct values [High cardinality]
nta_name has a high cardinality: 188 distinct values [High cardinality]
borocode is highly correlated with community board and 3 other fields [High correlation]
community board is highly correlated with borocode and 3 other fields [High correlation]
st_senate is highly correlated with st_assem [High correlation]
st_assem is highly correlated with st_senate [High correlation]
boro_ct is highly correlated with community board and 3 other fields [High correlation]
x_sp is highly correlated with longitude [High correlation]
longitude is highly correlated with x_sp [High correlation]
y_sp is highly correlated with latitude [High correlation]
latitude is highly correlated with y_sp [High correlation]
council district is highly correlated with cncldist [High correlation]
cncldist is highly correlated with council district [High correlation]
bin is highly correlated with community board and 3 other fields [High correlation]
bbl is highly correlated with community board and 3 other fields [High correlation]
borough is highly correlated with zip_city [High correlation]
zip_city is highly correlated with borough [High correlation]
health has 31616 (4.6%) missing values [Missing]
spc_latin has 31619 (4.6%) missing values [Missing]
spc_common has 31619 (4.6%) missing values [Missing]
steward has 31615 (4.6%) missing values [Missing]
guards has 31616 (4.6%) missing values [Missing]
sidewalk has 31616 (4.6%) missing values [Missing]
problems has 31664 (4.6%) missing values [Missing]
bin has 9559 (1.4%) missing values [Missing]
bbl has 9559 (1.4%) missing values [Missing]
tree_id has unique values [Unique]
tree_dbh has 17932 (2.6%) zeros [Zeros]
stump_diam has 666134 (97.4%) zeros [Zeros]
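
Most of these warnings can be reproduced with per-column counts. The following is a minimal sketch of such checks, not the profiler's actual code: the `column_warnings` helper is hypothetical, and the cardinality threshold of 50 is an assumption (the threshold in effect for this report is set in config.yaml, which is not shown here). Correlation warnings are left out.

```python
def column_warnings(name, values, cardinality_threshold=50):
    """Return warning strings for one column; None marks a missing value."""
    n = len(values)
    present = [v for v in values if v is not None]
    distinct = set(present)
    warnings = []

    if len(distinct) == 1:
        warnings.append(f'{name} has constant value "{present[0]}"')

    # Cardinality is only checked for text columns (assumed threshold).
    is_text = bool(present) and all(isinstance(v, str) for v in present)
    if is_text and len(distinct) > cardinality_threshold:
        warnings.append(
            f"{name} has a high cardinality: {len(distinct)} distinct values"
        )

    n_missing = n - len(present)
    if n_missing:
        warnings.append(
            f"{name} has {n_missing} ({100 * n_missing / n:.1f}%) missing values"
        )

    if n > 1 and len(distinct) == n:
        warnings.append(f"{name} has unique values")

    # bool is a subclass of int in Python, so exclude it explicitly.
    n_zeros = sum(
        1 for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool) and v == 0
    )
    if n_zeros:
        warnings.append(f"{name} has {n_zeros} ({100 * n_zeros / n:.1f}%) zeros")

    return warnings


# Hypothetical toy columns named after columns in the report.
print(column_warnings("state", ["New York"] * 4))
print(column_warnings("health", ["Good", None, "Fair", "Good"]))
print(column_warnings("tree_id", [3, 4, 7, 8]))
print(column_warnings("stump_diam", [0, 0, 0, 5]))
```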

Variables

tree_id
Real number (ℝ≥0)

UNIQUE

Distinct count: 683788
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 365205.01108530717
Minimum: 3
Maximum: 722694
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 3
5-th percentile: 38056.35
Q1: 186582.75
Median: 366214.5
Q3: 546170.25
95-th percentile: 687716.65
Maximum: 722694
Range: 722691
Interquartile range (IQR): 359587.5

Descriptive statistics

Standard deviation: 208122.0929
Coefficient of variation (CV): 0.5698774293
Kurtosis: -1.192862994
Mean: 365205.0111
Median Absolute Deviation (MAD): 179812.5
Skewness: -0.01716124126
Sum: 2.497228041e+11
Variance: 4.331480555e+10
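
Several of these statistics are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, and the CV is the standard deviation divided by the mean. The sketch below computes them with Python's standard library. It assumes a sample standard deviation and linearly interpolated quartiles (the pandas defaults), and the `describe` helper is a hypothetical name for this example, not the profiler's implementation.

```python
import statistics


def describe(values):
    """Spread statistics similar to those listed in the report (a sketch)."""
    xs = sorted(values)
    mean = statistics.fmean(xs)
    std = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    median = statistics.median(xs)
    # "inclusive" quartiles use linear interpolation between data points.
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {
        "range": xs[-1] - xs[0],
        "iqr": q3 - q1,
        "cv": std / mean,
        "mad": statistics.median(abs(x - median) for x in xs),
        "variance": std ** 2,
    }


toy = describe([1, 2, 3, 4, 5])
print(toy)

# Consistency checks on the tree_id figures reported above.
assert 722694 - 3 == 722691                # Range = Maximum - Minimum
assert 546170.25 - 186582.75 == 359587.5   # IQR = Q3 - Q1
assert abs(208122.0929 / 365205.0111 - 0.5698774293) < 1e-6  # CV = std / mean
```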
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2047 | 1 | < 0.1%
638877 | 1 | < 0.1%
626587 | 1 | < 0.1%
624538 | 1 | < 0.1%
630681 | 1 | < 0.1%
628632 | 1 | < 0.1%
651159 | 1 | < 0.1%
649110 | 1 | < 0.1%
655253 | 1 | < 0.1%
653204 | 1 | < 0.1%
642963 | 1 | < 0.1%
640914 | 1 | < 0.1%
647057 | 1 | < 0.1%
645008 | 1 | < 0.1%
601999 | 1 | < 0.1%
599950 | 1 | < 0.1%
606093 | 1 | < 0.1%
604044 | 1 | < 0.1%
593803 | 1 | < 0.1%
636828 | 1 | < 0.1%
632734 | 1 | < 0.1%
597897 | 1 | < 0.1%
634783 | 1 | < 0.1%
577459 | 1 | < 0.1%
575410 | 1 | < 0.1%
Other values (683763) | 683763 | > 99.9%

Value | Count | Frequency (%)
3 | 1 | < 0.1%
4 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%

Value | Count | Frequency (%)
722694 | 1 | < 0.1%
722693 | 1 | < 0.1%
722692 | 1 | < 0.1%
722691 | 1 | < 0.1%
722690 | 1 | < 0.1%
722689 | 1 | < 0.1%
722688 | 1 | < 0.1%
722687 | 1 | < 0.1%
722686 | 1 | < 0.1%
722685 | 1 | < 0.1%

block_id
Real number (ℝ≥0)

Distinct count: 101390
Unique (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 313793.0962359679
Minimum: 100002
Maximum: 999999
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 100002
5-th percentile: 107430
Q1: 221556
Median: 319967
Q3: 404624
95-th percentile: 510007
Maximum: 999999
Range: 899997
Interquartile range (IQR): 183068

Descriptive statistics

Standard deviation: 114839.0243
Coefficient of variation (CV): 0.3659705255
Kurtosis: -0.5123830828
Mean: 313793.0962
Median Absolute Deviation (MAD): 91359
Skewness: 0.08163277621
Sum: 2.145679537e+11
Variance: 1.31880015e+10

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
204850 | 450 | 0.1%
602362 | 358 | 0.1%
208115 | 250 | < 0.1%
506756 | 206 | < 0.1%
233208 | 197 | < 0.1%
340498 | 195 | < 0.1%
111902 | 178 | < 0.1%
302421 | 159 | < 0.1%
501930 | 145 | < 0.1%
340497 | 135 | < 0.1%
404057 | 133 | < 0.1%
412069 | 132 | < 0.1%
208130 | 129 | < 0.1%
313163 | 115 | < 0.1%
203224 | 112 | < 0.1%
340200 | 107 | < 0.1%
215490 | 107 | < 0.1%
233203 | 105 | < 0.1%
415097 | 103 | < 0.1%
111782 | 100 | < 0.1%
507767 | 100 | < 0.1%
502629 | 97 | < 0.1%
325867 | 93 | < 0.1%
515723 | 93 | < 0.1%
233200 | 93 | < 0.1%
Other values (101365) | 679896 | 99.4%

Value | Count | Frequency (%)
100002 | 4 | < 0.1%
100003 | 14 | < 0.1%
100004 | 3 | < 0.1%
100005 | 4 | < 0.1%
100014 | 2 | < 0.1%
100015 | 5 | < 0.1%
100016 | 5 | < 0.1%
100018 | 5 | < 0.1%
100019 | 2 | < 0.1%
100028 | 5 | < 0.1%

Value | Count | Frequency (%)
999999 | 5 | < 0.1%
603084 | 1 | < 0.1%
603083 | 3 | < 0.1%
603082 | 6 | < 0.1%
603081 | 6 | < 0.1%
603077 | 9 | < 0.1%
603075 | 4 | < 0.1%
603074 | 36 | < 0.1%
603073 | 34 | < 0.1%
603072 | 35 | < 0.1%

created_at
Categorical

HIGH CARDINALITY

Distinct count: 483
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
10/27/2015 | 6852 | 1.0%
10/13/2015 | 6676 | 1.0%
10/26/2015 | 6258 | 0.9%
10/29/2015 | 6223 | 0.9%
10/15/2015 | 6085 | 0.9%
11/03/2015 | 6028 | 0.9%
10/14/2015 | 6010 | 0.9%
10/30/2015 | 5834 | 0.9%
10/20/2015 | 5677 | 0.8%
10/22/2015 | 5368 | 0.8%
11/02/2015 | 5322 | 0.8%
10/16/2015 | 5151 | 0.8%
10/23/2015 | 5069 | 0.7%
11/04/2015 | 5065 | 0.7%
10/07/2015 | 5027 | 0.7%
10/10/2015 | 4919 | 0.7%
11/05/2015 | 4780 | 0.7%
10/21/2015 | 4542 | 0.7%
09/25/2015 | 4362 | 0.6%
10/19/2015 | 4319 | 0.6%
10/28/2015 | 4316 | 0.6%
11/18/2015 | 4248 | 0.6%
10/12/2015 | 4219 | 0.6%
09/23/2015 | 4169 | 0.6%
10/08/2015 | 4164 | 0.6%
Other values (458) | 553105 | 80.9%

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Overview of Unicode Properties

Unique unicode characters: 11
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
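
As an illustration of how such a character-category breakdown can be produced, here is a minimal sketch using Python's standard `unicodedata` module. It covers general categories only, because the standard library does not expose script or block lookups. The `category_counts` helper and the `CATEGORY_NAMES` mapping are hypothetical names for this example.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
}


def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in strings:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two created_at values taken from the table above.
counts = category_counts(["10/27/2015", "10/13/2015"])
total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
print(counts)
print(shares)
```

Every value in this column has the form MM/DD/YYYY, so eight of its ten characters are digits and two are slashes, which is why the report shows an 80.0% / 20.0% split between Decimal Number and Other Punctuation.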

Most occurring characters

Value | Count | Frequency (%)
0 | 1498005 | 21.9%
/ | 1367576 | 20.0%
1 | 1328543 | 19.4%
2 | 1031513 | 15.1%
5 | 599997 | 8.8%
6 | 317723 | 4.6%
8 | 185721 | 2.7%
9 | 178144 | 2.6%
7 | 160138 | 2.3%
3 | 106196 | 1.6%
4 | 64324 | 0.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 5470304 | 80.0%
Other Punctuation | 1367576 | 20.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
0 | 1498005 | 27.4%
1 | 1328543 | 24.3%
2 | 1031513 | 18.9%
5 | 599997 | 11.0%
6 | 317723 | 5.8%
8 | 185721 | 3.4%
9 | 178144 | 3.3%
7 | 160138 | 2.9%
3 | 106196 | 1.9%
4 | 64324 | 1.2%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
/ | 1367576 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6837880 | 100.0%

Most frequent Common characters

Identical to the "Most occurring characters" table above, since every character belongs to the Common script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6837880 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

tree_dbh
Real number (ℝ≥0)

ZEROS

Distinct count: 146
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.27978701000895
Minimum: 0
Maximum: 450
Zeros: 17932
Zeros (%): 2.6%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 4
Median: 9
Q3: 16
95-th percentile: 28
Maximum: 450
Range: 450
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 8.723042269
Coefficient of variation (CV): 0.7733339522
Kurtosis: 46.97759883
Mean: 11.27978701
Median Absolute Deviation (MAD): 5
Skewness: 2.429472373
Sum: 7712983
Variance: 76.09146642

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
4 | 60372 | 8.8%
3 | 54454 | 8.0%
2 | 41977 | 6.1%
5 | 41642 | 6.1%
11 | 37978 | 5.6%
6 | 36519 | 5.3%
7 | 30862 | 4.5%
8 | 30828 | 4.5%
10 | 29672 | 4.3%
9 | 28903 | 4.2%
12 | 25853 | 3.8%
13 | 24441 | 3.6%
14 | 21827 | 3.2%
15 | 19643 | 2.9%
18 | 19276 | 2.8%
0 | 17932 | 2.6%
16 | 17813 | 2.6%
17 | 15698 | 2.3%
19 | 12655 | 1.9%
20 | 11428 | 1.7%
21 | 10645 | 1.6%
22 | 10259 | 1.5%
25 | 10235 | 1.5%
23 | 9155 | 1.3%
24 | 8531 | 1.2%
Other values (121) | 55190 | 8.1%

Value | Count | Frequency (%)
0 | 17932 | 2.6%
1 | 2899 | 0.4%
2 | 41977 | 6.1%
3 | 54454 | 8.0%
4 | 60372 | 8.8%
5 | 41642 | 6.1%
6 | 36519 | 5.3%
7 | 30862 | 4.5%
8 | 30828 | 4.5%
9 | 28903 | 4.2%

Value | Count | Frequency (%)
450 | 1 | < 0.1%
425 | 1 | < 0.1%
389 | 1 | < 0.1%
318 | 2 | < 0.1%
298 | 1 | < 0.1%
293 | 1 | < 0.1%
291 | 1 | < 0.1%
282 | 1 | < 0.1%
281 | 1 | < 0.1%
266 | 1 | < 0.1%

stump_diam
Real number (ℝ≥0)

ZEROS

Distinct count: 100
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.43246298560372515
Minimum: 0
Maximum: 140
Zeros: 666134
Zeros (%): 97.4%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 140
Range: 140
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.29024074
Coefficient of variation (CV): 7.608144164
Kurtosis: 145.298141
Mean: 0.4324629856
Median Absolute Deviation (MAD): 0
Skewness: 10.36345045
Sum: 295713
Variance: 10.82568413

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 666134 | 97.4%
4 | 966 | 0.1%
5 | 939 | 0.1%
3 | 779 | 0.1%
6 | 754 | 0.1%
12 | 717 | 0.1%
10 | 716 | 0.1%
14 | 660 | 0.1%
8 | 660 | 0.1%
15 | 648 | 0.1%
7 | 612 | 0.1%
13 | 609 | 0.1%
20 | 572 | 0.1%
18 | 567 | 0.1%
16 | 557 | 0.1%
9 | 530 | 0.1%
17 | 526 | 0.1%
11 | 525 | 0.1%
22 | 426 | 0.1%
19 | 410 | 0.1%
24 | 404 | 0.1%
25 | 398 | 0.1%
2 | 363 | 0.1%
21 | 349 | 0.1%
30 | 338 | < 0.1%
Other values (75) | 3629 | 0.5%

Value | Count | Frequency (%)
0 | 666134 | 97.4%
1 | 106 | < 0.1%
2 | 363 | 0.1%
3 | 779 | 0.1%
4 | 966 | 0.1%
5 | 939 | 0.1%
6 | 754 | 0.1%
7 | 612 | 0.1%
8 | 660 | 0.1%
9 | 530 | 0.1%

Value | Count | Frequency (%)
140 | 1 | < 0.1%
134 | 1 | < 0.1%
131 | 1 | < 0.1%
125 | 1 | < 0.1%
120 | 1 | < 0.1%
118 | 1 | < 0.1%
115 | 1 | < 0.1%
109 | 1 | < 0.1%
107 | 1 | < 0.1%
104 | 1 | < 0.1%

curb_loc
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
OnCurb | 656896 | 96.1%
OffsetFromCurb | 26892 | 3.9%

Length

Max length: 14
Median length: 6
Mean length: 6.314623831
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
r | 710680 | 16.5%
O | 683788 | 15.8%
C | 683788 | 15.8%
u | 683788 | 15.8%
b | 683788 | 15.8%
n | 656896 | 15.2%
f | 53784 | 1.2%
s | 26892 | 0.6%
e | 26892 | 0.6%
t | 26892 | 0.6%
F | 26892 | 0.6%
o | 26892 | 0.6%
m | 26892 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2923396 | 67.7%
Uppercase Letter | 1394468 | 32.3%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
O | 683788 | 49.0%
C | 683788 | 49.0%
F | 26892 | 1.9%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
r | 710680 | 24.3%
u | 683788 | 23.4%
b | 683788 | 23.4%
n | 656896 | 22.5%
f | 53784 | 1.8%
s | 26892 | 0.9%
e | 26892 | 0.9%
t | 26892 | 0.9%
o | 26892 | 0.9%
m | 26892 | 0.9%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4317864 | 100.0%

Most frequent Latin characters

Identical to the "Most occurring characters" table above, since every character belongs to the Latin script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4317864 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

status
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
Alive | 652173 | 95.4%
Stump | 17654 | 2.6%
Dead | 13961 | 2.0%

Length

Max length: 5
Median length: 5
Mean length: 4.979582853
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
e | 666134 | 19.6%
A | 652173 | 19.2%
l | 652173 | 19.2%
i | 652173 | 19.2%
v | 652173 | 19.2%
S | 17654 | 0.5%
t | 17654 | 0.5%
u | 17654 | 0.5%
m | 17654 | 0.5%
p | 17654 | 0.5%
D | 13961 | 0.4%
a | 13961 | 0.4%
d | 13961 | 0.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2721191 | 79.9%
Uppercase Letter | 683788 | 20.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
A | 652173 | 95.4%
S | 17654 | 2.6%
D | 13961 | 2.0%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
e | 666134 | 24.5%
l | 652173 | 24.0%
i | 652173 | 24.0%
v | 652173 | 24.0%
t | 17654 | 0.6%
u | 17654 | 0.6%
m | 17654 | 0.6%
p | 17654 | 0.6%
a | 13961 | 0.5%
d | 13961 | 0.5%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3404979 | 100.0%

Most frequent Latin characters

Identical to the "Most occurring characters" table above, since every character belongs to the Latin script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3404979 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

health
Categorical

MISSING

Distinct count: 3
Unique (%): < 0.1%
Missing: 31616
Missing (%): 4.6%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
Good | 528850 | 77.3%
Fair | 96504 | 14.1%
Poor | 26818 | 3.9%
(Missing) | 31616 | 4.6%

Length

Max length: 4
Median length: 4
Mean length: 3.953763447
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 9
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
o | 1111336 | 41.1%
G | 528850 | 19.6%
d | 528850 | 19.6%
a | 128120 | 4.7%
r | 123322 | 4.6%
F | 96504 | 3.6%
i | 96504 | 3.6%
n | 63232 | 2.3%
P | 26818 | 1.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2051364 | 75.9%
Uppercase Letter | 652172 | 24.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
G | 528850 | 81.1%
F | 96504 | 14.8%
P | 26818 | 4.1%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
o | 1111336 | 54.2%
d | 528850 | 25.8%
a | 128120 | 6.2%
r | 123322 | 6.0%
i | 96504 | 4.7%
n | 63232 | 3.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2703536 | 100.0%

Most frequent Latin characters

Identical to the "Most occurring characters" table above, since every character belongs to the Latin script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2703536 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

spc_latin
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 132
Unique (%): < 0.1%
Missing: 31619
Missing (%): 4.6%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
Platanus x acerifolia | 87014 | 12.7%
Gleditsia triacanthos var. inermis | 64264 | 9.4%
Pyrus calleryana | 58931 | 8.6%
Quercus palustris | 53185 | 7.8%
Acer platanoides | 34189 | 5.0%
Tilia cordata | 29742 | 4.3%
Prunus | 29279 | 4.3%
Zelkova serrata | 29258 | 4.3%
Ginkgo biloba | 21024 | 3.1%
Styphnolobium japonicum | 19338 | 2.8%
Acer rubrum | 17246 | 2.5%
Fraxinus pennsylvanica | 16251 | 2.4%
Tilia americana | 13530 | 2.0%
Acer saccharinum | 12277 | 1.8%
Liquidambar styraciflua | 10657 | 1.6%
Quercus rubra | 8400 | 1.2%
Tilia tomentosa | 7995 | 1.2%
Ulmus americana | 7975 | 1.2%
Acer | 7080 | 1.0%
Prunus cerasifera | 6879 | 1.0%
Quercus bicolor | 6598 | 1.0%
Acer platanoides 'Crimson King' | 5923 | 0.9%
Ulmus parvifolia | 5345 | 0.8%
Prunus virginiana | 4888 | 0.7%
Syringa reticulata | 4568 | 0.7%
Other values (107) | 90333 | 13.2%
(Missing) | 31619 | 4.6%

Length

Max length: 34
Median length: 16
Mean length: 17.35580911
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 50
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
a | 1591747 | 13.4%
i | 1049571 | 8.8%
r | 993219 | 8.4%
s | 835821 | 7.0%
(space) | 829362 | 7.0%
l | 692257 | 5.8%
e | 689988 | 5.8%
u | 685213 | 5.8%
n | 683024 | 5.8%
c | 584030 | 4.9%
t | 539091 | 4.5%
o | 452769 | 3.8%
m | 230336 | 1.9%
p | 191639 | 1.6%
P | 190226 | 1.6%
y | 180741 | 1.5%
d | 179143 | 1.5%
v | 124304 | 1.0%
b | 117471 | 1.0%
f | 114353 | 1.0%
h | 111247 | 0.9%
x | 109411 | 0.9%
A | 93024 | 0.8%
G | 88652 | 0.7%
Q | 82867 | 0.7%
Other values (25) | 428188 | 3.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 10297866 | 86.8%
Space Separator | 829362 | 7.0%
Uppercase Letter | 664015 | 5.6%
Other Punctuation | 76451 | 0.6%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
P | 190226 | 28.6%
A | 93024 | 14.0%
G | 88652 | 13.4%
Q | 82867 | 12.5%
T | 53125 | 8.0%
Z | 29258 | 4.4%
C | 26340 | 4.0%
S | 25213 | 3.8%
F | 20379 | 3.1%
U | 14915 | 2.2%
L | 12221 | 1.8%
M | 10930 | 1.6%
K | 9642 | 1.5%
R | 1784 | 0.3%
B | 1400 | 0.2%
J | 1396 | 0.2%
O | 1081 | 0.2%
E | 915 | 0.1%
N | 288 | < 0.1%
H | 221 | < 0.1%
I | 138 | < 0.1%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
a | 1591747 | 15.5%
i | 1049571 | 10.2%
r | 993219 | 9.6%
s | 835821 | 8.1%
l | 692257 | 6.7%
e | 689988 | 6.7%
u | 685213 | 6.7%
n | 683024 | 6.6%
c | 584030 | 5.7%
t | 539091 | 5.2%
o | 452769 | 4.4%
m | 230336 | 2.2%
p | 191639 | 1.9%
y | 180741 | 1.8%
d | 179143 | 1.7%
v | 124304 | 1.2%
b | 117471 | 1.1%
f | 114353 | 1.1%
h | 111247 | 1.1%
x | 109411 | 1.1%
k | 54201 | 0.5%
g | 52562 | 0.5%
j | 21558 | 0.2%
q | 13677 | 0.1%
z | 248 | < 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 829362 | 100.0%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
. | 64605 | 84.5%
' | 11846 | 15.5%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10961881 | 92.4%
Common | 905813 | 7.6%

Most frequent Latin characters

Value | Count | Frequency (%)
a | 1591747 | 14.5%
i | 1049571 | 9.6%
r | 993219 | 9.1%
s | 835821 | 7.6%
l | 692257 | 6.3%
e | 689988 | 6.3%
u | 685213 | 6.3%
n | 683024 | 6.2%
c | 584030 | 5.3%
t | 539091 | 4.9%
o | 452769 | 4.1%
m | 230336 | 2.1%
p | 191639 | 1.7%
P | 190226 | 1.7%
y | 180741 | 1.6%
d | 179143 | 1.6%
v | 124304 | 1.1%
b | 117471 | 1.1%
f | 114353 | 1.0%
h | 111247 | 1.0%
x | 109411 | 1.0%
A | 93024 | 0.8%
G | 88652 | 0.8%
Q | 82867 | 0.8%
k | 54201 | 0.5%
Other values (22) | 297536 | 2.7%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 829362 | 91.6%
. | 64605 | 7.1%
' | 11846 | 1.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11867694 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

spc_common
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 132
Unique (%): < 0.1%
Missing: 31619
Missing (%): 4.6%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
London planetree | 87014 | 12.7%
honeylocust | 64264 | 9.4%
Callery pear | 58931 | 8.6%
pin oak | 53185 | 7.8%
Norway maple | 34189 | 5.0%
littleleaf linden | 29742 | 4.3%
cherry | 29279 | 4.3%
Japanese zelkova | 29258 | 4.3%
ginkgo | 21024 | 3.1%
Sophora | 19338 | 2.8%
red maple | 17246 | 2.5%
green ash | 16251 | 2.4%
American linden | 13530 | 2.0%
silver maple | 12277 | 1.8%
sweetgum | 10657 | 1.6%
northern red oak | 8400 | 1.2%
silver linden | 7995 | 1.2%
American elm | 7975 | 1.2%
maple | 7080 | 1.0%
purple-leaf plum | 6879 | 1.0%
swamp white oak | 6598 | 1.0%
crimson king maple | 5923 | 0.9%
Chinese elm | 5345 | 0.8%
'Schubert' chokecherry | 4888 | 0.7%
Japanese tree lilac | 4568 | 0.7%
Other values (107) | 90333 | 13.2%
(Missing) | 31619 | 4.6%

Length

Max length: 22
Median length: 12
Mean length: 11.55410449
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 42
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
e | 1090341 | 13.8%
a | 729247 | 9.2%
n | 725101 | 9.2%
l | 632093 | 8.0%
o | 590406 | 7.5%
r | 548056 | 6.9%
(space) | 515959 | 6.5%
p | 390862 | 4.9%
t | 289664 | 3.7%
i | 261377 | 3.3%
y | 209494 | 2.7%
h | 205759 | 2.6%
s | 203791 | 2.6%
d | 200183 | 2.5%
c | 187125 | 2.4%
m | 182864 | 2.3%
k | 159730 | 2.0%
u | 123827 | 1.6%
g | 93164 | 1.2%
L | 87014 | 1.1%
w | 86914 | 1.1%
C | 66211 | 0.8%
v | 53732 | 0.7%
f | 46255 | 0.6%
b | 36269 | 0.5%
Other values (17) | 185120 | 2.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 7075976 | 89.6%
Space Separator | 515959 | 6.5%
Uppercase Letter | 289070 | 3.7%
Other Punctuation | 11263 | 0.1%
Dash Punctuation | 8290 | 0.1%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
e | 1090341 | 15.4%
a | 729247 | 10.3%
n | 725101 | 10.2%
l | 632093 | 8.9%
o | 590406 | 8.3%
r | 548056 | 7.7%
p | 390862 | 5.5%
t | 289664 | 4.1%
i | 261377 | 3.7%
y | 209494 | 3.0%
h | 205759 | 2.9%
s | 203791 | 2.9%
d | 200183 | 2.8%
c | 187125 | 2.6%
m | 182864 | 2.6%
k | 159730 | 2.3%
u | 123827 | 1.7%
g | 93164 | 1.3%
w | 86914 | 1.2%
v | 53732 | 0.8%
f | 46255 | 0.7%
b | 36269 | 0.5%
z | 29575 | 0.4%
q | 83 | < 0.1%
x | 64 | < 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 515959 | 100.0%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
L | 87014 | 30.1%
C | 66211 | 22.9%
J | 35774 | 12.4%
N | 34544 | 12.0%
A | 29293 | 10.1%
S | 27392 | 9.5%
E | 3915 | 1.4%
K | 3843 | 1.3%
O | 323 | 0.1%
T | 317 | 0.1%
P | 277 | 0.1%
D | 85 | < 0.1%
H | 72 | < 0.1%
V | 10 | < 0.1%

Most frequent Dash Punctuation characters

Value | Count | Frequency (%)
- | 8290 | 100.0%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
' | 11263 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7365046 | 93.2%
Common | 535512 | 6.8%

Most frequent Latin characters

Value | Count | Frequency (%)
e | 1090341 | 14.8%
a | 729247 | 9.9%
n | 725101 | 9.8%
l | 632093 | 8.6%
o | 590406 | 8.0%
r | 548056 | 7.4%
p | 390862 | 5.3%
t | 289664 | 3.9%
i | 261377 | 3.5%
y | 209494 | 2.8%
h | 205759 | 2.8%
s | 203791 | 2.8%
d | 200183 | 2.7%
c | 187125 | 2.5%
m | 182864 | 2.5%
k | 159730 | 2.2%
u | 123827 | 1.7%
g | 93164 | 1.3%
L | 87014 | 1.2%
w | 86914 | 1.2%
C | 66211 | 0.9%
v | 53732 | 0.7%
f | 46255 | 0.6%
b | 36269 | 0.5%
J | 35774 | 0.5%
Other values (14) | 129793 | 1.8%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 515959 | 96.3%
' | 11263 | 2.1%
- | 8290 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7900558 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

steward
Categorical

MISSING

Distinct count: 4
Unique (%): < 0.1%
Missing: 31615
Missing (%): 4.6%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
None | 487823 | 71.3%
1or2 | 143557 | 21.0%
3or4 | 19183 | 2.8%
4orMore | 1610 | 0.2%
(Missing) | 31615 | 4.6%

Length

Max length: 7
Median length: 4
Mean length: 3.960828502
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 11
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
o | 653783 | 24.1%
n | 551053 | 20.3%
e | 489433 | 18.1%
N | 487823 | 18.0%
r | 165960 | 6.1%
1 | 143557 | 5.3%
2 | 143557 | 5.3%
a | 31615 | 1.2%
4 | 20793 | 0.8%
3 | 19183 | 0.7%
M | 1610 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1891844 | 69.9%
Uppercase Letter | 489433 | 18.1%
Decimal Number | 327090 | 12.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
N | 487823 | 99.7%
M | 1610 | 0.3%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
o | 653783 | 34.6%
n | 551053 | 29.1%
e | 489433 | 25.9%
r | 165960 | 8.8%
a | 31615 | 1.7%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
1 | 143557 | 43.9%
2 | 143557 | 43.9%
4 | 20793 | 6.4%
3 | 19183 | 5.9%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2381277 | 87.9%
Common | 327090 | 12.1%

Most frequent Latin characters

Value | Count | Frequency (%)
o | 653783 | 27.5%
n | 551053 | 23.1%
e | 489433 | 20.6%
N | 487823 | 20.5%
r | 165960 | 7.0%
a | 31615 | 1.3%
M | 1610 | 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
1 | 143557 | 43.9%
2 | 143557 | 43.9%
4 | 20793 | 6.4%
3 | 19183 | 5.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2708367 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

guards
Categorical

MISSING

Distinct count: 4
Unique (%): < 0.1%
Missing: 31616
Missing (%): 4.6%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
None | 572306 | 83.7%
Helpful | 51866 | 7.6%
Harmful | 20252 | 3.0%
Unsure | 7748 | 1.1%
(Missing) | 31616 | 4.6%

Length

Max length: 7
Median length: 4
Mean length: 4.292830526
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
n | 643286 | 21.9%
e | 631920 | 21.5%
N | 572306 | 19.5%
o | 572306 | 19.5%
l | 123984 | 4.2%
u | 79866 | 2.7%
H | 72118 | 2.5%
f | 72118 | 2.5%
a | 51868 | 1.8%
p | 51866 | 1.8%
r | 28000 | 1.0%
m | 20252 | 0.7%
U | 7748 | 0.3%
s | 7748 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2283214 | 77.8%
Uppercase Letter | 652172 | 22.2%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
N | 572306 | 87.8%
H | 72118 | 11.1%
U | 7748 | 1.2%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 643286 | 28.2%
e | 631920 | 27.7%
o | 572306 | 25.1%
l | 123984 | 5.4%
u | 79866 | 3.5%
f | 72118 | 3.2%
a | 51868 | 2.3%
p | 51866 | 2.3%
r | 28000 | 1.2%
m | 20252 | 0.9%
s | 7748 | 0.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2935386 | 100.0%

Most frequent Latin characters

Identical to the "Most occurring characters" table above, since every character belongs to the Latin script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2935386 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

sidewalk
Categorical

MISSING

Distinct count: 2
Unique (%): < 0.1%
Missing: 31616
Missing (%): 4.6%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
NoDamage | 464978 | 68.0%
Damage | 187194 | 27.4%
(Missing) | 31616 | 4.6%

Length

Max length: 8
Median length: 8
Mean length: 7.221296659
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 8
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
a | 1335960 | 27.1%
D | 652172 | 13.2%
m | 652172 | 13.2%
g | 652172 | 13.2%
e | 652172 | 13.2%
N | 464978 | 9.4%
o | 464978 | 9.4%
n | 63232 | 1.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3820686 | 77.4%
Uppercase Letter | 1117150 | 22.6%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
D | 652172 | 58.4%
N | 464978 | 41.6%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
a | 1335960 | 35.0%
m | 652172 | 17.1%
g | 652172 | 17.1%
e | 652172 | 17.1%
o | 464978 | 12.2%
n | 63232 | 1.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4937836 | 100.0%

Most frequent Latin characters

Identical to the "Most occurring characters" table above, since every character belongs to the Latin script.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4937836 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

user_type
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
TreesCount Staff | 296284 | 43.3%
Volunteer | 217518 | 31.8%
NYC Parks Staff | 169986 | 24.9%

Length

Max length: 16
Median length: 15
Mean length: 13.52465384
Min length: 9

Overview of Unicode Properties

Unique unicode characters: 19
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
e | 1027604 | 11.1%
t | 980072 | 10.6%
f | 932540 | 10.1%
r | 683788 | 7.4%
(space) | 636256 | 6.9%
a | 636256 | 6.9%
o | 513802 | 5.6%
u | 513802 | 5.6%
n | 513802 | 5.6%
s | 466270 | 5.0%
C | 466270 | 5.0%
S | 466270 | 5.0%
T | 296284 | 3.2%
V | 217518 | 2.4%
l | 217518 | 2.4%
N | 169986 | 1.8%
Y | 169986 | 1.8%
P | 169986 | 1.8%
k | 169986 | 1.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6655440 | 72.0%
Uppercase Letter | 1956300 | 21.2%
Space Separator | 636256 | 6.9%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 466270 | 23.8%
S | 466270 | 23.8%
T | 296284 | 15.1%
V | 217518 | 11.1%
N | 169986 | 8.7%
Y | 169986 | 8.7%
P | 169986 | 8.7%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
e | 1027604 | 15.4%
t | 980072 | 14.7%
f | 932540 | 14.0%
r | 683788 | 10.3%
a | 636256 | 9.6%
o | 513802 | 7.7%
u | 513802 | 7.7%
n | 513802 | 7.7%
s | 466270 | 7.0%
l | 217518 | 3.3%
k | 169986 | 2.6%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 636256 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8611740 | 93.1%
Common | 636256 | 6.9%

Most frequent Latin characters

Value | Count | Frequency (%)
e | 1027604 | 11.9%
t | 980072 | 11.4%
f | 932540 | 10.8%
r | 683788 | 7.9%
a | 636256 | 7.4%
o | 513802 | 6.0%
u | 513802 | 6.0%
n | 513802 | 6.0%
s | 466270 | 5.4%
C | 466270 | 5.4%
S | 466270 | 5.4%
T | 296284 | 3.4%
V | 217518 | 2.5%
l | 217518 | 2.5%
N | 169986 | 2.0%
Y | 169986 | 2.0%
P | 169986 | 2.0%
k | 169986 | 2.0%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 636256 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 9247996 | 100.0%

Most frequent ASCII characters

Identical to the "Most occurring characters" table above, since every character is in the ASCII block.

problems
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 232
Unique (%): < 0.1%
Missing: 31664
Missing (%): 4.6%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
None | 426280 | 62.3%
Stones | 95673 | 14.0%
BranchLights | 29452 | 4.3%
Stones,BranchLights | 17808 | 2.6%
RootOther | 11418 | 1.7%
TrunkOther | 11143 | 1.6%
BranchOther | 8352 | 1.2%
Stones,TrunkOther | 5183 | 0.8%
Stones,RootOther | 4468 | 0.7%
WiresRope | 4095 | 0.6%
Stones,BranchOther | 3786 | 0.6%
TrunkOther,BranchOther | 2477 | 0.4%
WiresRope,BranchLights | 2308 | 0.3%
RootOther,TrunkOther | 2137 | 0.3%
MetalGrates | 2098 | 0.3%
Stones,WiresRope,BranchLights | 1953 | 0.3%
RootOther,BranchLights | 1918 | 0.3%
RootOther,TrunkOther,BranchOther | 1807 | 0.3%
TrunkOther,BranchLights | 1558 | 0.2%
Stones,TrunkOther,BranchOther | 1539 | 0.2%
RootOther,BranchOther | 1503 | 0.2%
Stones,WiresRope | 1457 | 0.2%
Stones,RootOther,TrunkOther | 1360 | 0.2%
Stones,TrunkOther,BranchLights | 1176 | 0.2%
Stones,RootOther,BranchLights | 901 | 0.1%
Other values (207) | 10274 | 1.5%
(Missing) | 31664 | 4.6%
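
Many `problems` values are comma-separated lists of labels (for example `Stones,BranchLights`), so per-label totals can be more informative than counts of whole strings. The following is a minimal sketch, assuming the column has been loaded into a plain Python list with missing entries represented as `None`. The helper name `count_problem_labels` is hypothetical and is not part of pandas-profiling.

```python
from collections import Counter


def count_problem_labels(values):
    """Count individual labels in comma-separated strings, skipping missing entries."""
    counts = Counter()
    for value in values:
        if value is None:  # assumed representation of a missing cell
            continue
        for label in value.split(","):
            counts[label.strip()] += 1
    return counts


# Small illustrative sample using labels that appear in the table above.
sample = ["None", "Stones", "Stones,BranchLights", "RootOther", None]
label_counts = count_problem_labels(sample)
print(label_counts.most_common())
```

Note that the literal string `None` is counted as a label in its own right, which matches how the report lists it separately from `(Missing)`.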
 

Length

Max length: 95
Median length: 4
Mean length: 6.595658303
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 26
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
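
As a rough illustration of how per-category character counts like the ones in this section can be obtained, the sketch below uses the standard-library `unicodedata.category` function, which returns two-letter general-category codes such as `Lu` (Uppercase Letter), `Ll` (Lowercase Letter), and `Po` (Other Punctuation). It is an approximation under the assumption that missing values are `None`; it is not taken from pandas-profiling, and the function name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Tally Unicode general categories over every character of every value."""
    counts = Counter()
    for value in values:
        if value is None:  # assumed representation of a missing cell
            continue
        for ch in str(value):
            counts[unicodedata.category(ch)] += 1
    return counts


sample = ["None", "Stones,BranchLights", None]
counts = category_counts(sample)
print(counts)
```

The standard library does not expose the script or block properties, so those two breakdowns would need a different data source.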

Most occurring characters

ValueCountFrequency (%) 
n75034216.6%
 
e68797115.3%
 
o64019714.2%
 
N4262809.5%
 
t3280397.3%
 
h2373665.3%
 
r2247955.0%
 
s2206164.9%
 
S1404103.1%
 
a1258672.8%
 
O872501.9%
 
B867201.9%
 
c867201.9%
 
,820221.8%
 
i766701.7%
 
L633961.4%
 
g633961.4%
 
R435961.0%
 
k340150.8%
 
T336040.7%
 
u336040.7%
 
W132740.3%
 
p132740.3%
 
M35360.1%
 
l35360.1%
 

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3526408 | 78.2%
Uppercase Letter | 901602 | 20.0%
Other Punctuation | 82022 | 1.8%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N42628047.3%
 
S14041015.6%
 
O872509.7%
 
B867209.6%
 
L633967.0%
 
R435964.8%
 
T336043.7%
 
W132741.5%
 
M35360.4%
 
G35360.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n75034221.3%
 
e68797119.5%
 
o64019718.2%
 
t3280399.3%
 
h2373666.7%
 
r2247956.4%
 
s2206166.3%
 
a1258673.6%
 
c867202.5%
 
i766702.2%
 
g633961.8%
 
k340151.0%
 
u336041.0%
 
p132740.4%
 
l35360.1%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
,82022100.0%
 

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4428010 | 98.2%
Common | 82022 | 1.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n75034216.9%
 
e68797115.5%
 
o64019714.5%
 
N4262809.6%
 
t3280397.4%
 
h2373665.4%
 
r2247955.1%
 
s2206165.0%
 
S1404103.2%
 
a1258672.8%
 
O872502.0%
 
B867202.0%
 
c867202.0%
 
i766701.7%
 
L633961.4%
 
g633961.4%
 
R435961.0%
 
k340150.8%
 
T336040.8%
 
u336040.8%
 
W132740.3%
 
p132740.3%
 
M35360.1%
 
l35360.1%
 
G35360.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
,82022100.0%
 

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4510032 | 100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n75034216.6%
 
e68797115.3%
 
o64019714.2%
 
N4262809.5%
 
t3280397.3%
 
h2373665.3%
 
r2247955.0%
 
s2206164.9%
 
S1404103.1%
 
a1258672.8%
 
O872501.9%
 
B867201.9%
 
c867201.9%
 
,820221.8%
 
i766701.7%
 
L633961.4%
 
g633961.4%
 
R435961.0%
 
k340150.8%
 
T336040.7%
 
u336040.7%
 
W132740.3%
 
p132740.3%
 
M35360.1%
 
l35360.1%
 

root_stone
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 543789 | 79.5%
Yes | 139999 | 20.5%
 

root_grate
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 680252 | 99.5%
Yes | 3536 | 0.5%
 

root_other
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 653466 | 95.6%
Yes | 30322 | 4.4%
 

trunk_wire
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 670514 | 98.1%
Yes | 13274 | 1.9%
 

trnk_light
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 682757 | 99.8%
Yes | 1031 | 0.2%
 

trnk_other
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 651215 | 95.2%
Yes | 32573 | 4.8%
 

brch_light
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 621423 | 90.9%
Yes | 62365 | 9.1%
 

brch_shoe
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 683377 | 99.9%
Yes | 411 | 0.1%
 

brch_other
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
No | 659433 | 96.4%
Yes | 24355 | 3.6%
 

address
Categorical

HIGH CARDINALITY

Distinct count: 408701
Unique (%): 59.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
106 CROSS BAY BOULEVARD | 262 | < 0.1%
2 TELEPORT DRIVE | 143 | < 0.1%
2350 FOREST HILL ROAD | 140 | < 0.1%
900 SOUTH AVENUE | 90 | < 0.1%
2750 VETERANS ROAD WEST | 84 | < 0.1%
79-25 WINCHESTER BOULEVARD | 79 | < 0.1%
1200 SOUTH AVENUE | 74 | < 0.1%
69-053 210 STREET | 74 | < 0.1%
204 EDWARD CURRY AVENUE | 67 | < 0.1%
164-055 96 STREET | 64 | < 0.1%
120 CASALS PLACE | 63 | < 0.1%
3034 GOULDEN AVENUE | 61 | < 0.1%
28-055 ULMER STREET | 59 | < 0.1%
79-20 WINCHESTER BOULEVARD | 57 | < 0.1%
402 HUNTS POINT AVENUE | 56 | < 0.1%
3671 HUDSON MANOR TERRACE | 55 | < 0.1%
224 WEST STREET | 55 | < 0.1%
96 PARKSIDE AVENUE | 53 | < 0.1%
146-056 SPRINGFIELD BOULEVARD | 53 | < 0.1%
9000 BAY PARKWAY | 51 | < 0.1%
610 RIVER AVENUE | 50 | < 0.1%
2925 VETERANS ROAD WEST | 50 | < 0.1%
151 MAGUIRE AVENUE | 49 | < 0.1%
2200 ARTHUR KILL ROAD | 49 | < 0.1%
73-004 199 STREET | 48 | < 0.1%
Other values (408676) | 681902 | 99.7%
 

Length

Max length: 40
Median length: 18
Mean length: 18.02302176
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 39
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
E156499512.7%
 
150377312.2%
 
T8473176.9%
 
A6800935.5%
 
R6322915.1%
 
16256035.1%
 
05486674.5%
 
S5385604.4%
 
N5061954.1%
 
24221153.4%
 
U3521442.9%
 
V3244242.6%
 
33071782.5%
 
42970432.4%
 
O2902072.4%
 
52873072.3%
 
62450672.0%
 
L2391441.9%
 
-2389241.9%
 
72255371.8%
 
82210161.8%
 
91961501.6%
 
D1809711.5%
 
I1665211.4%
 
C1425861.2%
 
Other values (14)7400986.0%
 

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 7205359 | 58.5%
Decimal Number | 3375683 | 27.4%
Space Separator | 1503773 | 12.2%
Dash Punctuation | 238924 | 1.9%
Other Punctuation | 187 | < 0.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
162560318.5%
 
054866716.3%
 
242211512.5%
 
33071789.1%
 
42970438.8%
 
52873078.5%
 
62450677.3%
 
72255376.7%
 
82210166.5%
 
91961505.8%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-238924100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1503773100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
E156499521.7%
 
T84731711.8%
 
A6800939.4%
 
R6322918.8%
 
S5385607.5%
 
N5061957.0%
 
U3521444.9%
 
V3244244.5%
 
O2902074.0%
 
L2391443.3%
 
D1809712.5%
 
I1665212.3%
 
C1425862.0%
 
W1047151.5%
 
H1018131.4%
 
B977221.4%
 
P890281.2%
 
M818491.1%
 
Y734211.0%
 
G664830.9%
 
K599710.8%
 
F413170.6%
 
J95950.1%
 
X77120.1%
 
Q3335< 0.1%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'187100.0%
 

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7205359 | 58.5%
Common | 5118567 | 41.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
150377329.4%
 
162560312.2%
 
054866710.7%
 
24221158.2%
 
33071786.0%
 
42970435.8%
 
52873075.6%
 
62450674.8%
 
-2389244.7%
 
72255374.4%
 
82210164.3%
 
91961503.8%
 
'187< 0.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
E156499521.7%
 
T84731711.8%
 
A6800939.4%
 
R6322918.8%
 
S5385607.5%
 
N5061957.0%
 
U3521444.9%
 
V3244244.5%
 
O2902074.0%
 
L2391443.3%
 
D1809712.5%
 
I1665212.3%
 
C1425862.0%
 
W1047151.5%
 
H1018131.4%
 
B977221.4%
 
P890281.2%
 
M818491.1%
 
Y734211.0%
 
G664830.9%
 
K599710.8%
 
F413170.6%
 
J95950.1%
 
X77120.1%
 
Q3335< 0.1%
 

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12323926 | 100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
E156499512.7%
 
150377312.2%
 
T8473176.9%
 
A6800935.5%
 
R6322915.1%
 
16256035.1%
 
05486674.5%
 
S5385604.4%
 
N5061954.1%
 
24221153.4%
 
U3521442.9%
 
V3244242.6%
 
33071782.5%
 
42970432.4%
 
O2902072.4%
 
52873072.3%
 
62450672.0%
 
L2391441.9%
 
-2389241.9%
 
72255371.8%
 
82210161.8%
 
91961501.6%
 
D1809711.5%
 
I1665211.4%
 
C1425861.2%
 
Other values (14)7400986.0%
 

postcode
Real number (ℝ≥0)

Distinct count: 191
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10916.246044095538
Minimum: 83
Maximum: 11697
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 83
5-th percentile: 10025
Q1: 10451
median: 11214
Q3: 11365
95-th percentile: 11432
Maximum: 11697
Range: 11614
Interquartile range (IQR): 914

Descriptive statistics

Standard deviation: 651.5533636
Coefficient of variation (CV): 0.05968657732
Kurtosis: 102.1137958
Mean: 10916.24604
Median Absolute Deviation (MAD): 203
Skewness: -6.507755595
Sum: 7464398050
Variance: 424521.7856
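
Several of these figures are derived from others: the range is the maximum minus the minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below re-derives them from the postcode numbers reported in this section as a consistency check. It only illustrates the definitions and is not how pandas-profiling computes the values internally.

```python
import math

# Figures copied from the postcode summary in this section.
minimum, maximum = 83, 11697
q1, q3 = 10451, 11365
mean = 10916.24604
std = 651.5533636

value_range = maximum - minimum  # compare with the reported Range
iqr = q3 - q1                    # compare with the reported IQR
cv = std / mean                  # compare with the reported CV
variance = std ** 2              # compare with the reported Variance

print(value_range, iqr)
print(math.isclose(cv, 0.05968657732, rel_tol=1e-6))
print(math.isclose(variance, 424521.7856, rel_tol=1e-6))
```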
Histogram with fixed size bins (bins=10)
Most frequent values

Value | Count | Frequency (%)
10312 | 22186 | 3.2%
10314 | 16905 | 2.5%
10306 | 13030 | 1.9%
10309 | 12650 | 1.8%
11234 | 11253 | 1.6%
11385 | 10937 | 1.6%
11357 | 9449 | 1.4%
11207 | 8634 | 1.3%
11434 | 8274 | 1.2%
11208 | 8245 | 1.2%
11413 | 7470 | 1.1%
11365 | 7428 | 1.1%
11230 | 7336 | 1.1%
10308 | 7116 | 1.0%
11375 | 7059 | 1.0%
11364 | 7054 | 1.0%
11432 | 6958 | 1.0%
11358 | 6928 | 1.0%
11236 | 6859 | 1.0%
10305 | 6855 | 1.0%
10469 | 6710 | 1.0%
11215 | 6360 | 0.9%
11422 | 6324 | 0.9%
11209 | 6201 | 0.9%
11361 | 6181 | 0.9%
Other values (166) | 459386 | 67.2%

Smallest values

Value | Count | Frequency (%)
83 | 935 | 0.1%
10001 | 911 | 0.1%
10002 | 2265 | 0.3%
10003 | 2025 | 0.3%
10004 | 118 | < 0.1%
10005 | 144 | < 0.1%
10006 | 53 | < 0.1%
10007 | 355 | 0.1%
10009 | 1924 | 0.3%
10010 | 889 | 0.1%

Largest values

Value | Count | Frequency (%)
11697 | 30 | < 0.1%
11694 | 3572 | 0.5%
11693 | 1169 | 0.2%
11692 | 2013 | 0.3%
11691 | 5718 | 0.8%
11451 | 12 | < 0.1%
11436 | 2407 | 0.4%
11435 | 4595 | 0.7%
11434 | 8274 | 1.2%
11433 | 3745 | 0.5%
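
The value 83 stands out against the otherwise five-digit postcodes. One possible explanation, which is an assumption and not something the report states, is that the column holds postal codes that were parsed as numbers, so any leading zeros were dropped. If that is the case, treating the column as text may suit it better than treating it as a real number. The helper `format_postcode` below is a hypothetical sketch of restoring a fixed five-character form.

```python
def format_postcode(value):
    """Render a numeric postcode as a five-character string, padding with leading zeros."""
    return f"{int(value):05d}"


# Smallest, a typical, and the largest value from the tables above.
codes = [83, 10001, 11697]
formatted = [format_postcode(code) for code in codes]
print(formatted)
```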
 

zip_city
Categorical

HIGH CORRELATION

Distinct count: 48
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
Brooklyn | 177300 | 25.9%
Staten Island | 105318 | 15.4%
Bronx | 85203 | 12.5%
New York | 64488 | 9.4%
Jamaica | 26028 | 3.8%
Flushing | 23389 | 3.4%
Ridgewood | 10937 | 1.6%
Fresh Meadows | 10441 | 1.5%
Queens Village | 10127 | 1.5%
Astoria | 10007 | 1.5%
Whitestone | 9449 | 1.4%
Bayside | 8679 | 1.3%
Springfield Gardens | 7470 | 1.1%
Little Neck | 7280 | 1.1%
Forest Hills | 7059 | 1.0%
Oakland Gardens | 7054 | 1.0%
Far Rockaway | 6887 | 1.0%
East Elmhurst | 6475 | 0.9%
Rosedale | 6324 | 0.9%
Woodside | 5651 | 0.8%
South Ozone Park | 5494 | 0.8%
Ozone Park | 5456 | 0.8%
Bellerose | 5121 | 0.7%
Middle Village | 4893 | 0.7%
Saint Albans | 4697 | 0.7%
Other values (23) | 62561 | 9.1%
 

Length

Max length: 19
Median length: 8
Mean length: 9.315960502
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 46
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
o66205810.4%
 
n6044339.5%
 
a4745557.4%
 
l4461987.0%
 
r4406226.9%
 
e4009696.3%
 
t3094104.9%
 
3056284.8%
 
k2949514.6%
 
s2838994.5%
 
B2809834.4%
 
d2242333.5%
 
y2024763.2%
 
i1820372.9%
 
S1274482.0%
 
I1088061.7%
 
w1035921.6%
 
h865151.4%
 
x852031.3%
 
g730021.1%
 
N726331.1%
 
Y644881.0%
 
c579080.9%
 
u541580.9%
 
F493150.8%
 
Other values (21)3746225.9%
 

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5075098 | 79.7%
Uppercase Letter | 989416 | 15.5%
Space Separator | 305628 | 4.8%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B28098328.4%
 
S12744812.9%
 
I10880611.0%
 
N726337.3%
 
Y644886.5%
 
F493155.0%
 
R370003.7%
 
J293233.0%
 
H288852.9%
 
P240742.4%
 
O221342.2%
 
G203972.1%
 
M193672.0%
 
W179551.8%
 
E171541.7%
 
A167171.7%
 
C150881.5%
 
V150201.5%
 
L107591.1%
 
Q101271.0%
 
K17430.2%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o66205813.0%
 
n60443311.9%
 
a4745559.4%
 
l4461988.8%
 
r4406228.7%
 
e4009697.9%
 
t3094106.1%
 
k2949515.8%
 
s2838995.6%
 
d2242334.4%
 
y2024764.0%
 
i1820373.6%
 
w1035922.0%
 
h865151.7%
 
x852031.7%
 
g730021.4%
 
c579081.1%
 
u541581.1%
 
m461320.9%
 
p115030.2%
 
z109800.2%
 
b79260.2%
 
f74700.1%
 
v48680.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
305628100.0%
 

Most occurring scripts

Value | Count | Frequency (%)
Latin | 6064514 | 95.2%
Common | 305628 | 4.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o66205810.9%
 
n60443310.0%
 
a4745557.8%
 
l4461987.4%
 
r4406227.3%
 
e4009696.6%
 
t3094105.1%
 
k2949514.9%
 
s2838994.7%
 
B2809834.6%
 
d2242333.7%
 
y2024763.3%
 
i1820373.0%
 
S1274482.1%
 
I1088061.8%
 
w1035921.7%
 
h865151.4%
 
x852031.4%
 
g730021.2%
 
N726331.2%
 
Y644881.1%
 
c579081.0%
 
u541580.9%
 
F493150.8%
 
m461320.8%
 
Other values (20)3284905.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
305628100.0%
 

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6370142 | 100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o66205810.4%
 
n6044339.5%
 
a4745557.4%
 
l4461987.0%
 
r4406226.9%
 
e4009696.3%
 
t3094104.9%
 
3056284.8%
 
k2949514.6%
 
s2838994.5%
 
B2809834.4%
 
d2242333.5%
 
y2024763.2%
 
i1820372.9%
 
S1274482.0%
 
I1088061.7%
 
w1035921.6%
 
h865151.4%
 
x852031.3%
 
g730021.1%
 
N726331.1%
 
Y644881.0%
 
c579080.9%
 
u541580.9%
 
F493150.8%
 
Other values (21)3746225.9%
 

community board
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 59
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 343.50540372162135
Minimum: 101
Maximum: 503
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 101
5-th percentile: 108
Q1: 302
median: 402
Q3: 412
95-th percentile: 503
Maximum: 503
Range: 402
Interquartile range (IQR): 110

Descriptive statistics

Standard deviation: 115.7406006
Coefficient of variation (CV): 0.3369396794
Kurtosis: -0.5111178327
Mean: 343.5054037
Median Absolute Deviation (MAD): 92
Skewness: -0.5610389849
Sum: 234884873
Variance: 13395.88663
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
503539347.9%
 
413370175.4%
 
407306204.5%
 
411280714.1%
 
412263793.9%
 
502257173.8%
 
501256673.8%
 
408203833.0%
 
405195502.9%
 
318193192.8%
 
305165892.4%
 
410152242.2%
 
315137582.0%
 
401130081.9%
 
312125001.8%
 
414124121.8%
 
301120661.8%
 
409114811.7%
 
210111801.6%
 
212111081.6%
 
403108271.6%
 
310105981.5%
 
314105361.5%
 
306102901.5%
 
406102581.5%
 
Other values (34)21529631.5%
 
ValueCountFrequency (%) 
10123970.4%
 
10250190.7%
 
10349390.7%
 
10447040.7%
 
10521560.3%
 
10650610.7%
 
10788141.3%
 
10892691.4%
 
10949870.7%
 
11059620.9%
 
ValueCountFrequency (%) 
503539347.9%
 
502257173.8%
 
501256673.8%
 
414124121.8%
 
413370175.4%
 
412263793.9%
 
411280714.1%
 
410152242.2%
 
409114811.7%
 
408203833.0%
 

borocode
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.3585000029248833
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 4
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1667456
Coefficient of variation (CV): 0.3474008037
Kurtosis: -0.5342007856
Mean: 3.358500003
Median Absolute Deviation (MAD): 1
Skewness: -0.5046858172
Sum: 2296502
Variance: 1.361295296
Histogram with fixed size bins (bins=10)
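Most frequent values

Value | Count | Frequency (%)
4 | 250551 | 36.6%
3 | 177293 | 25.9%
5 | 105318 | 15.4%
2 | 85203 | 12.5%
1 | 65423 | 9.6%

Smallest values

Value | Count | Frequency (%)
1 | 65423 | 9.6%
2 | 85203 | 12.5%
3 | 177293 | 25.9%
4 | 250551 | 36.6%
5 | 105318 | 15.4%

Largest values

Value | Count | Frequency (%)
5 | 105318 | 15.4%
4 | 250551 | 36.6%
3 | 177293 | 25.9%
2 | 85203 | 12.5%
1 | 65423 | 9.6%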
 

borough
Categorical

HIGH CORRELATION

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
Queens | 250551 | 36.6%
Brooklyn | 177293 | 25.9%
Staten Island | 105318 | 15.4%
Bronx | 85203 | 12.5%
Manhattan | 65423 | 9.6%
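
The `borough` counts are the same five numbers as the `borocode` counts, which suggests the two columns encode the same information. The sketch below checks this using the figures reported in this document. The code-to-name mapping is an assumption based on the customary New York City borough numbering; the report itself does not state it. The last lines also show that the smallest and largest `community board` and `boro_ct` values begin with a digit in the 1 to 5 range, which would be consistent with their HIGH CORRELATION flags, though this is only an observation about those extreme values.

```python
# Counts copied from the borocode and borough frequency tables.
borocode_counts = {1: 65423, 2: 85203, 3: 177293, 4: 250551, 5: 105318}
borough_counts = {
    "Manhattan": 65423,
    "Bronx": 85203,
    "Brooklyn": 177293,
    "Queens": 250551,
    "Staten Island": 105318,
}

# Assumed mapping (customary NYC borough numbering, not stated in the report).
code_to_borough = {1: "Manhattan", 2: "Bronx", 3: "Brooklyn", 4: "Queens", 5: "Staten Island"}

# Collect any code whose count differs from its assumed borough's count.
mismatches = [
    (code, name)
    for code, name in code_to_borough.items()
    if borocode_counts[code] != borough_counts[name]
]
print(mismatches)


def leading_digit(value):
    """Return the first decimal digit of a positive integer identifier."""
    return int(str(value)[0])


print(leading_digit(101), leading_digit(503))          # community board minimum and maximum
print(leading_digit(1000201), leading_digit(5032300))  # boro_ct minimum and maximum
```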
 

Length

Max length: 13
Median length: 8
Mean length: 7.759138797
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n85452916.1%
 
e60642011.4%
 
o4397898.3%
 
a4069057.7%
 
s3558696.7%
 
t3414826.4%
 
l2826115.3%
 
B2624964.9%
 
r2624964.9%
 
Q2505514.7%
 
u2505514.7%
 
k1772933.3%
 
y1772933.3%
 
S1053182.0%
 
1053182.0%
 
I1053182.0%
 
d1053182.0%
 
x852031.6%
 
M654231.2%
 
h654231.2%
 

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4411182 | 83.1%
Uppercase Letter | 789106 | 14.9%
Space Separator | 105318 | 2.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B26249633.3%
 
Q25055131.8%
 
S10531813.3%
 
I10531813.3%
 
M654238.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n85452919.4%
 
e60642013.7%
 
o43978910.0%
 
a4069059.2%
 
s3558698.1%
 
t3414827.7%
 
l2826116.4%
 
r2624966.0%
 
u2505515.7%
 
k1772934.0%
 
y1772934.0%
 
d1053182.4%
 
x852031.9%
 
h654231.5%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
105318100.0%
 

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5200288 | 98.0%
Common | 105318 | 2.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n85452916.4%
 
e60642011.7%
 
o4397898.5%
 
a4069057.8%
 
s3558696.8%
 
t3414826.6%
 
l2826115.4%
 
B2624965.0%
 
r2624965.0%
 
Q2505514.8%
 
u2505514.8%
 
k1772933.4%
 
y1772933.4%
 
S1053182.0%
 
I1053182.0%
 
d1053182.0%
 
x852031.6%
 
M654231.3%
 
h654231.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
105318100.0%
 

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5305606 | 100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n85452916.1%
 
e60642011.4%
 
o4397898.3%
 
a4069057.7%
 
s3558696.7%
 
t3414826.4%
 
l2826115.3%
 
B2624964.9%
 
r2624964.9%
 
Q2505514.7%
 
u2505514.7%
 
k1772933.3%
 
y1772933.3%
 
S1053182.0%
 
1053182.0%
 
I1053182.0%
 
d1053182.0%
 
x852031.6%
 
M654231.2%
 
h654231.2%
 

cncldist
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 51
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.943181219910265
Minimum: 1
Maximum: 51
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 19
median: 30
Q3: 43
95-th percentile: 51
Maximum: 51
Range: 50
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.32853127
Coefficient of variation (CV): 0.4785240139
Kurtosis: -1.050565109
Mean: 29.94318122
Median Absolute Deviation (MAD): 12
Skewness: -0.1089933201
Sum: 20474788
Variance: 205.3068082
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
51512367.5%
 
19344295.0%
 
50330354.8%
 
23307434.5%
 
31231613.4%
 
49210473.1%
 
27201162.9%
 
32195082.9%
 
24189932.8%
 
30185512.7%
 
13175872.6%
 
46169132.5%
 
28158072.3%
 
20141892.1%
 
29139462.0%
 
39138592.0%
 
43131961.9%
 
42131171.9%
 
33127841.9%
 
17118511.7%
 
48117861.7%
 
45117581.7%
 
44116591.7%
 
22115371.7%
 
37109901.6%
 
Other values (26)21199031.0%
 
ValueCountFrequency (%) 
156940.8%
 
255640.8%
 
386311.3%
 
485211.2%
 
549820.7%
 
680501.2%
 
765721.0%
 
872931.1%
 
982131.2%
 
1065011.0%
 
ValueCountFrequency (%) 
51512367.5%
 
50330354.8%
 
49210473.1%
 
48117861.7%
 
4792591.4%
 
46169132.5%
 
45117581.7%
 
44116591.7%
 
43131961.9%
 
42131171.9%
 

st_assem
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 65
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.791583063756605
Minimum: 23
Maximum: 87
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 23
5-th percentile: 24
Q1: 33
median: 52
Q3: 64
95-th percentile: 83
Maximum: 87
Range: 64
Interquartile range (IQR): 31

Descriptive statistics

Standard deviation: 18.96652002
Coefficient of variation (CV): 0.3734185642
Kurtosis: -1.167528325
Mean: 50.79158306
Median Absolute Deviation (MAD): 16
Skewness: 0.1590070245
Sum: 34730675
Variance: 359.7288818
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
62460026.7%
 
26277634.1%
 
64229223.4%
 
63218833.2%
 
25215143.1%
 
33209273.1%
 
61176692.6%
 
24176432.6%
 
23176152.6%
 
29175722.6%
 
27148532.2%
 
31144512.1%
 
28137892.0%
 
32134432.0%
 
59132231.9%
 
82128871.9%
 
30116891.7%
 
52116251.7%
 
60113651.7%
 
41106741.6%
 
37102751.5%
 
40102571.5%
 
44100831.5%
 
3899671.5%
 
5096711.4%
 
Other values (40)27402640.1%
 
ValueCountFrequency (%) 
23176152.6%
 
24176432.6%
 
25215143.1%
 
26277634.1%
 
27148532.2%
 
28137892.0%
 
29175722.6%
 
30116891.7%
 
31144512.1%
 
32134432.0%
 
ValueCountFrequency (%) 
8774281.1%
 
8655300.8%
 
8565431.0%
 
8482531.2%
 
8388081.3%
 
82128871.9%
 
8185371.2%
 
8093141.4%
 
7970831.0%
 
7851820.8%
 

st_senate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 26
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.615781499529096
Minimum: 10
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 10
5-th percentile: 10
Q1: 14
median: 21
Q3: 25
95-th percentile: 34
Maximum: 36
Range: 26
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 7.390843831
Coefficient of variation (CV): 0.3585041795
Kurtosis: -0.9483978839
Mean: 20.6157815
Median Absolute Deviation (MAD): 6
Skewness: 0.2993718667
Sum: 14096824
Variance: 54.62457253
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
248724112.8%
 
11677059.9%
 
15446246.5%
 
14383515.6%
 
10350445.1%
 
34310664.5%
 
19290834.3%
 
22271794.0%
 
23254013.7%
 
16241463.5%
 
25234993.4%
 
17222253.3%
 
12218473.2%
 
21207303.0%
 
18206033.0%
 
13188272.8%
 
20168482.5%
 
26164562.4%
 
32159152.3%
 
36145442.1%
 
33144622.1%
 
29142852.1%
 
30139292.0%
 
27136852.0%
 
28131951.9%
 
ValueCountFrequency (%) 
10350445.1%
 
11677059.9%
 
12218473.2%
 
13188272.8%
 
14383515.6%
 
15446246.5%
 
16241463.5%
 
17222253.3%
 
18206033.0%
 
19290834.3%
 
ValueCountFrequency (%) 
36145442.1%
 
34310664.5%
 
33144622.1%
 
32159152.3%
 
31128981.9%
 
30139292.0%
 
29142852.1%
 
28131951.9%
 
27136852.0%
 
26164562.4%
 

nta
Categorical

HIGH CARDINALITY

Distinct count: 188
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
SI01 | 12969 | 1.9%
SI54 | 10734 | 1.6%
QN46 | 9780 | 1.4%
BK82 | 9607 | 1.4%
SI32 | 9251 | 1.4%
SI05 | 8446 | 1.2%
SI11 | 8216 | 1.2%
QN17 | 7701 | 1.1%
QN49 | 7620 | 1.1%
BK45 | 7449 | 1.1%
QN55 | 7321 | 1.1%
QN45 | 7302 | 1.1%
QN08 | 7239 | 1.1%
QN51 | 7133 | 1.0%
BK31 | 7026 | 1.0%
SI48 | 6999 | 1.0%
QN28 | 6863 | 1.0%
QN34 | 6856 | 1.0%
QN44 | 6814 | 1.0%
QN20 | 6481 | 0.9%
BK61 | 6411 | 0.9%
BK37 | 6314 | 0.9%
QN43 | 6246 | 0.9%
BK42 | 6116 | 0.9%
QN42 | 6059 | 0.9%
Other values (163) | 490835 | 71.8%
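
Each `nta` code listed above consists of two uppercase letters followed by two digits. The two-letter prefix looks like a borough abbreviation, although that reading is an assumption and not something the report states. A minimal sketch of aggregating counts by prefix, using only the first five rows of the table above as input:

```python
from collections import defaultdict

# First five rows of the nta frequency table above: (code, count).
top_rows = [
    ("SI01", 12969),
    ("SI54", 10734),
    ("QN46", 9780),
    ("BK82", 9607),
    ("SI32", 9251),
]

# Sum the counts for each two-letter prefix.
prefix_totals = defaultdict(int)
for code, count in top_rows:
    prefix_totals[code[:2]] += count

print(dict(prefix_totals))
```

Applied to the full column instead of this five-row sample, the same grouping would give one total per prefix.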
 

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
N31619311.6%
 
B2622779.6%
 
Q2505249.2%
 
31876856.9%
 
21825886.7%
 
K1773206.5%
 
41749786.4%
 
11676146.1%
 
51601685.9%
 
01388205.1%
 
S1053183.9%
 
I1053183.9%
 
71043023.8%
 
6975303.6%
 
8975143.6%
 
X849573.1%
 
M656692.4%
 
9563772.1%
 

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1367576 | 50.0%
Decimal Number | 1367576 | 50.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N31619323.1%
 
B26227719.2%
 
Q25052418.3%
 
K17732013.0%
 
S1053187.7%
 
I1053187.7%
 
X849576.2%
 
M656694.8%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
318768513.7%
 
218258813.4%
 
417497812.8%
 
116761412.3%
 
516016811.7%
 
013882010.2%
 
71043027.6%
 
6975307.1%
 
8975147.1%
 
9563774.1%
 

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1367576 | 50.0%
Common | 1367576 | 50.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
N31619323.1%
 
B26227719.2%
 
Q25052418.3%
 
K17732013.0%
 
S1053187.7%
 
I1053187.7%
 
X849576.2%
 
M656694.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
318768513.7%
 
218258813.4%
 
417497812.8%
 
116761412.3%
 
516016811.7%
 
013882010.2%
 
71043027.6%
 
6975307.1%
 
8975147.1%
 
9563774.1%
 

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2735152 | 100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
N31619311.6%
 
B2622779.6%
 
Q2505249.2%
 
31876856.9%
 
21825886.7%
 
K1773206.5%
 
41749786.4%
 
11676146.1%
 
51601685.9%
 
01388205.1%
 
S1053183.9%
 
I1053183.9%
 
71043023.8%
 
6975303.6%
 
8975143.6%
 
X849573.1%
 
M656692.4%
 
9563772.1%
 

nta_name
Categorical

HIGH CARDINALITY

Distinct count: 188
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
Annadale-Huguenot-Prince's Bay-Eltingville | 12969 | 1.9%
Great Kills | 10734 | 1.6%
Bayside-Bayside Hills | 9780 | 1.4%
East New York | 9607 | 1.4%
Rossville-Woodrow | 9251 | 1.4%
New Springville-Bloomfield-Travis | 8446 | 1.2%
Charleston-Richmond Valley-Tottenville | 8216 | 1.2%
Forest Hills | 7701 | 1.1%
Whitestone | 7620 | 1.1%
Georgetown-Marine Park-Bergen Beach-Mill Basin | 7449 | 1.1%
South Ozone Park | 7321 | 1.1%
Douglas Manor-Douglaston-Little Neck | 7302 | 1.1%
St. Albans | 7239 | 1.1%
Murray Hill | 7133 | 1.0%
Bay Ridge | 7026 | 1.0%
Arden Heights | 6999 | 1.0%
Jackson Heights | 6863 | 1.0%
Queens Village | 6856 | 1.0%
Glen Oaks-Floral Park-New Hyde Park | 6814 | 1.0%
Ridgewood | 6481 | 0.9%
Crown Heights North | 6411 | 0.9%
Park Slope-Gowanus | 6314 | 0.9%
Bellerose | 6246 | 0.9%
Flatbush | 6116 | 0.9%
Oakland Gardens | 6059 | 0.9%
Other values (163) | 490835 | 71.8%
 

Length

Max length: 56
Median length: 16
Mean length: 19.83679737
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 55
Unique unicode categories: 7
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e12153599.0%
 
a9636377.1%
 
l8800736.5%
 
o8537636.3%
 
8086466.0%
 
r8042395.9%
 
i7679665.7%
 
n7381605.4%
 
t7327925.4%
 
s7088875.2%
 
-4591533.4%
 
d4030363.0%
 
h3813582.8%
 
g3014412.2%
 
u2745622.0%
 
H2542611.9%
 
B2267331.7%
 
w2135881.6%
 
k1966151.4%
 
y1840191.4%
 
c1836131.4%
 
S1608741.2%
 
P1549001.1%
 
m1332291.0%
 
C1278590.9%
 
Other values (30)143540110.6%
 

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 10289513 | 75.9%
Uppercase Letter | 1967877 | 14.5%
Space Separator | 808646 | 6.0%
Dash Punctuation | 459153 | 3.4%
Other Punctuation | 32993 | 0.2%
Open Punctuation | 2991 | < 0.1%
Close Punctuation | 2991 | < 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
H25426112.9%
 
B22673311.5%
 
S1608748.2%
 
P1549007.9%
 
C1278596.5%
 
M1083905.5%
 
G963994.9%
 
E959494.9%
 
N956054.9%
 
W942214.8%
 
R804784.1%
 
F707443.6%
 
O545612.8%
 
A542032.8%
 
V477872.4%
 
T470702.4%
 
L445262.3%
 
D361471.8%
 
K250881.3%
 
U241731.2%
 
J227241.2%
 
Y178550.9%
 
I160880.8%
 
Q112420.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e121535911.8%
 
a9636379.4%
 
l8800738.6%
 
o8537638.3%
 
r8042397.8%
 
i7679667.5%
 
n7381607.2%
 
t7327927.1%
 
s7088876.9%
 
d4030363.9%
 
h3813583.7%
 
g3014412.9%
 
u2745622.7%
 
w2135882.1%
 
k1966151.9%
 
y1840191.8%
 
c1836131.8%
 
m1332291.3%
 
v1249151.2%
 
p935090.9%
 
b700190.7%
 
f336160.3%
 
z149160.1%
 
q81960.1%
 
x80050.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
808646100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-459153100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'1674550.8%
 
.1624849.2%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(2991100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)2991100.0%
 

Most occurring scripts

Value | Count | Frequency (%)
Latin | 12257390 | 90.4%
Common | 1306774 | 9.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e12153599.9%
 
a9636377.9%
 
l8800737.2%
 
o8537637.0%
 
r8042396.6%
 
i7679666.3%
 
n7381606.0%
 
t7327926.0%
 
s7088875.8%
 
d4030363.3%
 
h3813583.1%
 
g3014412.5%
 
u2745622.2%
 
H2542612.1%
 
B2267331.8%
 
w2135881.7%
 
k1966151.6%
 
y1840191.5%
 
c1836131.5%
 
S1608741.3%
 
P1549001.3%
 
m1332291.1%
 
C1278591.0%
 
v1249151.0%
 
M1083900.9%
 
Other values (24)11631219.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
80864661.9%
 
-45915335.1%
 
'167451.3%
 
.162481.2%
 
(29910.2%
 
)29910.2%
 

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13564164 | 100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e12153599.0%
 
a9636377.1%
 
l8800736.5%
 
o8537636.3%
 
8086466.0%
 
r8042395.9%
 
i7679665.7%
 
n7381605.4%
 
t7327925.4%
 
s7088875.2%
 
-4591533.4%
 
d4030363.0%
 
h3813582.8%
 
g3014412.2%
 
u2745622.0%
 
H2542611.9%
 
B2267331.7%
 
w2135881.6%
 
k1966151.4%
 
y1840191.4%
 
c1836131.4%
 
S1608741.2%
 
P1549001.1%
 
m1332291.0%
 
C1278590.9%
 
Other values (30)143540110.6%
 

boro_ct
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 2152
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3404914.118364171
Minimum: 1000201
Maximum: 5032300
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 1000201
5-th percentile: 1015700
Q1: 3011700
median: 4008100
Q3: 4103202
95-th percentile: 5019800
Maximum: 5032300
Range: 4032099
Interquartile range (IQR): 1091502

Descriptive statistics

Standard deviation: 1175863.419
Coefficient of variation (CV): 0.3453430477
Kurtosis: -0.5389642631
Mean: 3404914.118
Median Absolute Deviation (MAD): 964100
Skewness: -0.5502694254
Sum: 2.328239415e+12
Variance: 1.382654779e+12
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
502080137760.6%
 
501760032450.5%
 
502080327490.4%
 
502080427230.4%
 
501980026200.4%
 
502260024520.4%
 
501700524020.4%
 
408920023990.4%
 
502440123920.3%
 
501701022690.3%
 
406640021020.3%
 
415710120980.3%
 
501700820960.3%
 
501380020310.3%
 
415790120210.3%
 
501700919970.3%
 
501560119410.3%
 
502440219010.3%
 
501560218470.3%
 
414830018060.3%
 
501460417960.3%
 
200930017420.3%
 
415790217070.2%
 
502910316360.2%
 
412770016040.2%
 
Other values (2127)62843691.9%
 
ValueCountFrequency (%) 
100020170< 0.1%
 
1000202217< 0.1%
 
1000600187< 0.1%
 
1000700144< 0.1%
 
1000800288< 0.1%
 
100090083< 0.1%
 
100100124< 0.1%
 
100100256< 0.1%
 
1001200111< 0.1%
 
100130094< 0.1%
 
ValueCountFrequency (%) 
5032300114< 0.1%
 
50319025110.1%
 
50319013930.1%
 
50303027690.1%
 
50303015720.1%
 
502910412940.2%
 
502910316360.2%
 
50291027520.1%
 
50279004160.1%
 
50277065290.1%
 

state
Categorical

CONSTANT
REJECTED

Distinct count: 1
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value | Count | Frequency (%)
New York | 683788 | 100.0%
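
The CONSTANT and REJECTED flags reflect that `state` takes a single value in every row, so it cannot help distinguish one observation from another. A minimal pure-Python sketch of such a check follows, assuming the data is held as a dictionary of column name to list of values with `None` for missing cells. The function name `constant_columns` is hypothetical, and this is not pandas-profiling's own implementation.

```python
def constant_columns(table):
    """Return the names of columns whose non-missing values are all identical."""
    constant = []
    for name, values in table.items():
        distinct = {value for value in values if value is not None}
        if len(distinct) <= 1:
            constant.append(name)
    return constant


# Tiny illustrative table; the real dataset has 683788 rows.
table = {
    "state": ["New York", "New York", "New York"],
    "borough": ["Queens", "Brooklyn", "Queens"],
}
constant = constant_columns(table)
print(constant)
```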
 

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 8
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
N | 683788 | 12.5%
e | 683788 | 12.5%
w | 683788 | 12.5%
(space) | 683788 | 12.5%
Y | 683788 | 12.5%
o | 683788 | 12.5%
r | 683788 | 12.5%
k | 683788 | 12.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3418940 | 62.5%
Uppercase Letter | 1367576 | 25.0%
Space Separator | 683788 | 12.5%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
N | 683788 | 50.0%
Y | 683788 | 50.0%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
e | 683788 | 20.0%
w | 683788 | 20.0%
o | 683788 | 20.0%
r | 683788 | 20.0%
k | 683788 | 20.0%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 683788 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4786516 | 87.5%
Common | 683788 | 12.5%

Most frequent Latin characters

Value | Count | Frequency (%)
N | 683788 | 14.3%
e | 683788 | 14.3%
w | 683788 | 14.3%
Y | 683788 | 14.3%
o | 683788 | 14.3%
r | 683788 | 14.3%
k | 683788 | 14.3%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 683788 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5470304 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
N | 683788 | 12.5%
e | 683788 | 12.5%
w | 683788 | 12.5%
(space) | 683788 | 12.5%
Y | 683788 | 12.5%
o | 683788 | 12.5%
r | 683788 | 12.5%
k | 683788 | 12.5%

latitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 676080
Unique (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.701261475468044
Minimum: 40.49846614
Maximum: 40.91291831
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 40.49846614
5-th percentile: 40.54844736
Q1: 40.6319285
median: 40.70061174
Q3: 40.76222798
95-th percentile: 40.85640711
Maximum: 40.91291831
Range: 0.41445217
Interquartile range (IQR): 0.130299485

Descriptive statistics

Standard deviation: 0.09031135514
Coefficient of variation (CV): 0.002218883442
Kurtosis: -0.6338341145
Mean: 40.70126148
Median Absolute Deviation (MAD): 0.0650372
Skewness: 0.06273807884
Sum: 27831034.18
Variance: 0.008156140868
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
40.6603593835< 0.1%
 
40.6109770928< 0.1%
 
40.68946117< 0.1%
 
40.6159659317< 0.1%
 
40.7784969311< 0.1%
 
40.660362349< 0.1%
 
40.88303419< 0.1%
 
40.855716748< 0.1%
 
40.77151088< 0.1%
 
40.722637917< 0.1%
 
40.632644697< 0.1%
 
40.787503777< 0.1%
 
40.636095876< 0.1%
 
40.636077666< 0.1%
 
40.867658196< 0.1%
 
40.689934046< 0.1%
 
40.718196785< 0.1%
 
40.728586655< 0.1%
 
40.851191535< 0.1%
 
40.693124465< 0.1%
 
40.693349685< 0.1%
 
40.693127715< 0.1%
 
40.762546795< 0.1%
 
40.693382544< 0.1%
 
40.622726824< 0.1%
 
Other values (676055)683558> 99.9%
 
ValueCountFrequency (%) 
40.498466141< 0.1%
 
40.498471261< 0.1%
 
40.498509581< 0.1%
 
40.498542951< 0.1%
 
40.49859871< 0.1%
 
40.498749561< 0.1%
 
40.498793521< 0.1%
 
40.498812261< 0.1%
 
40.498816781< 0.1%
 
40.498871481< 0.1%
 
ValueCountFrequency (%) 
40.912918311< 0.1%
 
40.912806761< 0.1%
 
40.912717851< 0.1%
 
40.912614391< 0.1%
 
40.912605411< 0.1%
 
40.91243461< 0.1%
 
40.912368291< 0.1%
 
40.912208691< 0.1%
 
40.912173961< 0.1%
 
40.912151841< 0.1%
 

longitude
Real number (ℝ)

HIGH CORRELATION

Distinct count: 677101
Unique (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.92405951203185
Minimum: -74.2549647
Maximum: -73.70048817
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: -74.2549647
5-th percentile: -74.16993065
Q1: -73.98050019
median: -73.91291141
Q3: -73.83491039
95-th percentile: -73.74446944
Maximum: -73.70048817
Range: 0.55447653
Interquartile range (IQR): 0.1455897975

Descriptive statistics

Standard deviation: 0.123583459
Coefficient of variation (CV): -0.001671762344
Kurtosis: -0.1187960159
Mean: -73.92405951
Median Absolute Deviation (MAD): 0.071501375
Skewness: -0.6120792448
Sum: -50548384.81
Variance: 0.01527287133
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-73.7610352935< 0.1%
 
-74.1568609829< 0.1%
 
-73.9568006217< 0.1%
 
-73.7527498417< 0.1%
 
-73.7976474911< 0.1%
 
-73.761069319< 0.1%
 
-73.900453759< 0.1%
 
-73.914414488< 0.1%
 
-73.916666718< 0.1%
 
-73.973292437< 0.1%
 
-73.851436517< 0.1%
 
-73.903178847< 0.1%
 
-73.78450237< 0.1%
 
-73.796832146< 0.1%
 
-74.020780266< 0.1%
 
-74.020804756< 0.1%
 
-74.014972115< 0.1%
 
-73.781598935< 0.1%
 
-73.986284565< 0.1%
 
-73.983427165< 0.1%
 
-73.818663915< 0.1%
 
-73.985160085< 0.1%
 
-73.895470285< 0.1%
 
-73.821800594< 0.1%
 
-73.986152244< 0.1%
 
Other values (677076)683556> 99.9%
 
ValueCountFrequency (%) 
-74.25496471< 0.1%
 
-74.254894521< 0.1%
 
-74.254876271< 0.1%
 
-74.254856621< 0.1%
 
-74.254834161< 0.1%
 
-74.254792221< 0.1%
 
-74.254775041< 0.1%
 
-74.25474861< 0.1%
 
-74.254722171< 0.1%
 
-74.254695731< 0.1%
 
ValueCountFrequency (%) 
-73.700488171< 0.1%
 
-73.700592481< 0.1%
 
-73.700593681< 0.1%
 
-73.700596511< 0.1%
 
-73.700601241< 0.1%
 
-73.70060541< 0.1%
 
-73.700606221< 0.1%
 
-73.700610341< 0.1%
 
-73.700620651< 0.1%
 
-73.700640051< 0.1%
 

x_sp
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 681630
Unique (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1005279.8605396459
Minimum: 913349.2661
Maximum: 1067247.624
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 913349.2661
5-th percentile: 937034.4028
Q1: 989657.8414
median: 1008386.228
Q3: 1029991.278
95-th percentile: 1055117.133
Maximum: 1067247.624
Range: 153898.3579
Interquartile range (IQR): 40333.43662

Descriptive statistics

Standard deviation: 34285.05448
Coefficient of variation (CV): 0.03410498491
Kurtosis: -0.1137942786
Mean: 1005279.861
Median Absolute Deviation (MAD): 19812.1789
Skewness: -0.6143566834
Sum: 6.873983053e+11
Variance: 1175464961
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1050549.64835< 0.1%
 
940697.401428< 0.1%
 
1052818.47717< 0.1%
 
996243.459317< 0.1%
 
1040292.35811< 0.1%
 
1050540.2079< 0.1%
 
1011776.4789< 0.1%
 
1014131.1629< 0.1%
 
1007955.7658< 0.1%
 
1007302.7488< 0.1%
 
991662.97797< 0.1%
 
1043924.8917< 0.1%
 
1025429.87< 0.1%
 
978475.70976< 0.1%
 
978482.51026< 0.1%
 
1011029.1296< 0.1%
 
1040593.0096< 0.1%
 
980099.6545< 0.1%
 
988845.80385< 0.1%
 
988365.23525< 0.1%
 
988053.41955< 0.1%
 
1013221.6515< 0.1%
 
1044751.5525< 0.1%
 
988734.9924< 0.1%
 
1010695.9654< 0.1%
 
Other values (681605)683554> 99.9%
 
ValueCountFrequency (%) 
913349.26611< 0.1%
 
913368.64771< 0.1%
 
913373.68671< 0.1%
 
913379.11351< 0.1%
 
913385.31561< 0.1%
 
913396.89531< 0.1%
 
913401.63871< 0.1%
 
913408.93631< 0.1%
 
913416.23381< 0.1%
 
913423.53111< 0.1%
 
ValueCountFrequency (%) 
1067247.6241< 0.1%
 
1067220.1261< 0.1%
 
1067219.3091< 0.1%
 
1067218.9011< 0.1%
 
1067217.9451< 0.1%
 
1067216.7451< 0.1%
 
1067215.321< 0.1%
 
1067214.9421< 0.1%
 
1067212.3481< 0.1%
 
1067206.7511< 0.1%
 

y_sp
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 682632
Unique (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 194798.42462483808
Minimum: 120973.7922
Maximum: 271894.0921
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 120973.7922
5-th percentile: 139145.449
Q1: 169515.1537
median: 194560.2525
Q3: 217019.572
95-th percentile: 251311.1615
Maximum: 271894.0921
Range: 150920.2999
Interquartile range (IQR): 47504.41825

Descriptive statistics

Standard deviation: 32902.06111
Coefficient of variation (CV): 0.168903117
Kurtosis: -0.6353082377
Mean: 194798.4246
Median Absolute Deviation (MAD): 23708.38335
Skewness: 0.0627524678
Sum: 1.332008252e+11
Variance: 1082545626
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
179953.550935< 0.1%
 
161910.811428< 0.1%
 
190562.439517< 0.1%
 
163692.339717< 0.1%
 
222968.987511< 0.1%
 
261006.64219< 0.1%
 
179954.60249< 0.1%
 
220370.55738< 0.1%
 
251049.1818< 0.1%
 
226259.15887< 0.1%
 
202587.93517< 0.1%
 
169767.02767< 0.1%
 
190703.35366< 0.1%
 
171023.93756< 0.1%
 
255403.74686< 0.1%
 
171017.30436< 0.1%
 
191801.85735< 0.1%
 
156188.62115< 0.1%
 
204737.61225< 0.1%
 
200627.89055< 0.1%
 
191882.77845< 0.1%
 
191800.815< 0.1%
 
217168.51325< 0.1%
 
200935.33115< 0.1%
 
192345.08684< 0.1%
 
Other values (682607)683557> 99.9%
 
ValueCountFrequency (%) 
120973.79221< 0.1%
 
120974.93071< 0.1%
 
120989.63011< 0.1%
 
121001.06481< 0.1%
 
121021.39091< 0.1%
 
121077.1241< 0.1%
 
121093.15481< 0.1%
 
121098.61091< 0.1%
 
121100.99961< 0.1%
 
121122.12621< 0.1%
 
ValueCountFrequency (%) 
271894.09211< 0.1%
 
271853.44351< 0.1%
 
271821.0421< 0.1%
 
271783.33941< 0.1%
 
271780.35381< 0.1%
 
271718.40151< 0.1%
 
271694.19091< 0.1%
 
271635.81931< 0.1%
 
271623.2171< 0.1%
 
271615.23111< 0.1%
 

council district
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 51
Unique (%): < 0.1%
Missing: 6519
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.02733035175093
Minimum: 1.0
Maximum: 51.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 19
median: 30
Q3: 43
95-th percentile: 51
Maximum: 51
Range: 50
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.30171652
Coefficient of variation (CV): 0.4762899783
Kurtosis: -1.046661632
Mean: 30.02733035
Median Absolute Deviation (MAD): 12
Skewness: -0.1146616275
Sum: 20336580
Variance: 204.5390955
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
51 | 50865 | 7.4%
19 | 33626 | 4.9%
50 | 32948 | 4.8%
23 | 30411 | 4.4%
31 | 22763 | 3.3%
49 | 20863 | 3.1%
27 | 20061 | 2.9%
32 | 19687 | 2.9%
24 | 18620 | 2.7%
30 | 18307 | 2.7%
13 | 17433 | 2.5%
46 | 16863 | 2.5%
28 | 15635 | 2.3%
20 | 14128 | 2.1%
39 | 13837 | 2.0%
29 | 13789 | 2.0%
42 | 13083 | 1.9%
43 | 13054 | 1.9%
33 | 12740 | 1.9%
17 | 11825 | 1.7%
48 | 11802 | 1.7%
45 | 11717 | 1.7%
44 | 11648 | 1.7%
22 | 11640 | 1.7%
37 | 10995 | 1.6%
Other values (26) | 208929 | 30.6%

Value | Count | Frequency (%)
1 | 5404 | 0.8%
2 | 5537 | 0.8%
3 | 8506 | 1.2%
4 | 8649 | 1.3%
5 | 4931 | 0.7%
6 | 6947 | 1.0%
7 | 6519 | 1.0%
8 | 7343 | 1.1%
9 | 7994 | 1.2%
10 | 6137 | 0.9%

Value | Count | Frequency (%)
51 | 50865 | 7.4%
50 | 32948 | 4.8%
49 | 20863 | 3.1%
48 | 11802 | 1.7%
47 | 9200 | 1.3%
46 | 16863 | 2.5%
45 | 11717 | 1.7%
44 | 11648 | 1.7%
43 | 13054 | 1.9%
42 | 13083 | 1.9%

census tract
Real number (ℝ≥0)

Distinct count: 1315
Unique (%): 0.2%
Missing: 6519
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11957.368422296015
Minimum: 1.0
Maximum: 157903.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 50
Q1: 202
median: 516
Q3: 1417
95-th percentile: 84601
Maximum: 157903
Range: 157902
Interquartile range (IQR): 1215

Descriptive statistics

Standard deviation: 30745.73981
Coefficient of variation (CV): 2.571279794
Kurtosis: 10.97663521
Mean: 11957.36842
Median Absolute Deviation (MAD): 385
Skewness: 3.349170094
Sum: 8098354954
Variance: 945300516.5
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
20801 | 3615 | 0.5%
176 | 3613 | 0.5%
198 | 3407 | 0.5%
138 | 2946 | 0.4%
226 | 2802 | 0.4%
20803 | 2740 | 0.4%
20804 | 2734 | 0.4%
251 | 2592 | 0.4%
892 | 2578 | 0.4%
134 | 2517 | 0.4%
122 | 2412 | 0.4%
17010 | 2391 | 0.3%
24401 | 2385 | 0.3%
17005 | 2336 | 0.3%
70 | 2278 | 0.3%
151 | 2231 | 0.3%
17008 | 2111 | 0.3%
157101 | 2103 | 0.3%
664 | 2100 | 0.3%
15601 | 2064 | 0.3%
157901 | 2040 | 0.3%
17009 | 2012 | 0.3%
15602 | 1927 | 0.3%
93 | 1916 | 0.3%
24402 | 1903 | 0.3%
Other values (1290) | 615516 | 90.0%
(Missing) | 6519 | 1.0%

Value | Count | Frequency (%)
1 | 809 | 0.1%
2 | 731 | 0.1%
3 | 271 | < 0.1%
4 | 680 | 0.1%
6 | 700 | 0.1%
7 | 1658 | 0.2%
8 | 1450 | 0.2%
9 | 610 | 0.1%
10 | 159 | < 0.1%
11 | 703 | 0.1%

Value | Count | Frequency (%)
157903 | 1378 | 0.2%
157902 | 1695 | 0.2%
157901 | 2040 | 0.3%
157102 | 423 | 0.1%
157101 | 2103 | 0.3%
155102 | 1583 | 0.2%
152902 | 1057 | 0.2%
152901 | 806 | 0.1%
150702 | 1056 | 0.2%
150701 | 1388 | 0.2%

bin
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count: 378888
Unique (%): 56.2%
Missing: 9559
Missing (%): 1.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 3495439.0063153617
Minimum: 1000000.0
Maximum: 5515124.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 1000000
5-th percentile: 1055120.4
Q1: 3031991
median: 4020352
Q3: 4263123
95-th percentile: 5086977
Maximum: 5515124
Range: 4515124
Interquartile range (IQR): 1231132

Descriptive statistics

Standard deviation: 1193274.96
Coefficient of variation (CV): 0.34138057
Kurtosis: -0.5429353008
Mean: 3495439.006
Median Absolute Deviation (MAD): 885878
Skewness: -0.6010568435
Sum: 2.356726346e+12
Variance: 1.423905131e+12
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
4000000 | 561 | 0.1%
3000000 | 486 | 0.1%
2000000 | 401 | 0.1%
5000000 | 342 | 0.1%
4315089 | 262 | < 0.1%
1000000 | 260 | < 0.1%
5146614 | 143 | < 0.1%
1067973 | 100 | < 0.1%
5113281 | 90 | < 0.1%
1066406 | 86 | < 0.1%
5149741 | 84 | < 0.1%
4438348 | 79 | < 0.1%
4462897 | 74 | < 0.1%
5151612 | 74 | < 0.1%
4444310 | 74 | < 0.1%
4296078 | 64 | < 0.1%
4286097 | 61 | < 0.1%
4539763 | 57 | < 0.1%
1012266 | 57 | < 0.1%
2114610 | 56 | < 0.1%
2117759 | 55 | < 0.1%
2085874 | 55 | < 0.1%
3347226 | 53 | < 0.1%
3340691 | 51 | < 0.1%
5150129 | 50 | < 0.1%
Other values (378863) | 670554 | 98.1%
(Missing) | 9559 | 1.4%

Value | Count | Frequency (%)
1000000 | 260 | < 0.1%
1000005 | 8 | < 0.1%
1000006 | 10 | < 0.1%
1000012 | 1 | < 0.1%
1000013 | 2 | < 0.1%
1000014 | 1 | < 0.1%
1000018 | 14 | < 0.1%
1000019 | 1 | < 0.1%
1000020 | 8 | < 0.1%
1000021 | 8 | < 0.1%

Value | Count | Frequency (%)
5515124 | 3 | < 0.1%
5513451 | 1 | < 0.1%
5169169 | 5 | < 0.1%
5169168 | 7 | < 0.1%
5169166 | 1 | < 0.1%
5169161 | 1 | < 0.1%
5169153 | 2 | < 0.1%
5169142 | 2 | < 0.1%
5169093 | 4 | < 0.1%
5169070 | 1 | < 0.1%

bbl
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count: 366635
Unique (%): 54.4%
Missing: 9559
Missing (%): 1.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 3413413626.228289
Minimum: 0.0
Maximum: 5080500094.0
Zeros: 20
Zeros (%): < 0.1%
Memory size: 5.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1014673024
Q1: 3011240055
median: 4008560127
Q3: 4105700010
95-th percentile: 5056060065
Maximum: 5080500094
Range: 5080500094
Interquartile range (IQR): 1094459955

Descriptive statistics

Standard deviation: 1174892044
Coefficient of variation (CV): 0.3441985569
Kurtosis: -0.522131309
Mean: 3413413626
Median Absolute Deviation (MAD): 962350119
Skewness: -0.5327777886
Sum: 2.301422456e+15
Variance: 1.380371315e+18
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
2051410120 | 456 | 0.1%
4153750020 | 262 | < 0.1%
1009720001 | 215 | < 0.1%
2049050001 | 166 | < 0.1%
1009780001 | 147 | < 0.1%
3044520020 | 147 | < 0.1%
5021650160 | 143 | < 0.1%
3045860300 | 134 | < 0.1%
4124950002 | 132 | < 0.1%
1006800001 | 120 | < 0.1%
4076330001 | 120 | < 0.1%
2039437501 | 118 | < 0.1%
3083290225 | 114 | < 0.1%
4079140002 | 113 | < 0.1%
3005380001 | 112 | < 0.1%
2039387501 | 111 | < 0.1%
4067920030 | 109 | < 0.1%
2051350051 | 106 | < 0.1%
4079290002 | 103 | < 0.1%
3011170001 | 96 | < 0.1%
3044520200 | 95 | < 0.1%
4071170006 | 92 | < 0.1%
5017250085 | 90 | < 0.1%
4085150002 | 89 | < 0.1%
4065400003 | 89 | < 0.1%
Other values (366610) | 670750 | 98.1%
(Missing) | 9559 | 1.4%

Value | Count | Frequency (%)
0 | 20 | < 0.1%
1000047501 | 8 | < 0.1%
1000057501 | 10 | < 0.1%
1000070031 | 1 | < 0.1%
1000070037 | 2 | < 0.1%
1000070038 | 1 | < 0.1%
1000077501 | 4 | < 0.1%
1000087501 | 9 | < 0.1%
1000090001 | 14 | < 0.1%
1000090007 | 1 | < 0.1%

Value | Count | Frequency (%)
5080500094 | 2 | < 0.1%
5080500092 | 2 | < 0.1%
5080500086 | 2 | < 0.1%
5080500083 | 2 | < 0.1%
5080500078 | 3 | < 0.1%
5080500076 | 2 | < 0.1%
5080500068 | 3 | < 0.1%
5080500060 | 3 | < 0.1%
5080500058 | 1 | < 0.1%
5080500053 | 3 | < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the slope of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
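As a rough illustration of that definition, the sketch below computes r in plain Python for the latitude and y_sp values of the first five rows listed in the Sample section. The function name pearson_r is an assumption made for this example and is not part of pandas-profiling. Five rows are far too few to reproduce the report's correlation matrix, so this only shows the mechanics.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of both standard deviations cancel,
    # so plain sums of products are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# latitude and y_sp of the first five sample rows of this report
latitude = [40.723092, 40.794111, 40.717581, 40.713537, 40.666778]
y_sp = [202756.7687, 228644.8374, 200716.8913, 199244.2531, 182202.4260]

r = pearson_r(latitude, y_sp)
# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_shifted = pearson_r([10.0 * v + 3.0 for v in latitude], y_sp)
print(round(r, 4), round(r_shifted, 4))
```

On these five rows r comes out very close to +1, which is consistent with the high-correlation warning the report raises for latitude and y_sp.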

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
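A minimal sketch of that recipe, assuming ties receive the average of their rank positions (a common convention, not something this report states). The helper names average_ranks, pearson_r and spearman_rho are illustrative only.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship (y = x ** 3), made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

Because the relationship is strictly increasing, the ranks agree perfectly and ρ is 1, while Pearson's r on the raw values stays below 1.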

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
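The pair-counting definition can be sketched directly. The function below implements exactly the formula stated above (often called tau-a). Library implementations may apply an additional adjustment for ties, so results on tied data can differ from this sketch. The name kendall_tau_a is an assumption for illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    concordant = 0
    discordant = 0
    total_pairs = 0
    for i, j in combinations(range(len(xs)), 2):
        total_pairs += 1
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie in X or Y; it counts only toward total_pairs
    return (concordant - discordant) / total_pairs


# Made-up example: one swapped pair out of six gives (5 - 1) / 6.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This quadratic-time loop is fine for a handful of observations, but it would be slow on the 683788 rows profiled here.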

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
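For orientation, here is a minimal sketch of a bias-corrected Cramér's V computed from a contingency table, following the correction commonly attributed to Bergsma (2013). The function name and the toy tables are assumptions for this example, and the code is not taken from pandas-profiling. It assumes the table has no all-zero rows or columns.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)
    if n < 2:
        raise ValueError("need at least two observations")

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence; clip at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy 2x2 tables (made up): perfect association and exact independence.
print(cramers_v_corrected([[50, 0], [0, 50]]))
print(cramers_v_corrected([[25, 25], [25, 25]]))
```

The uncorrected estimator would be sqrt(phi2 / min(r - 1, k - 1)). The correction mainly matters for small samples or tables with many categories.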

Missing values

Sample

First rows

(index) | tree_id | block_id | created_at | tree_dbh | stump_diam | curb_loc | status | health | spc_latin | spc_common | steward | guards | sidewalk | user_type | problems | root_stone | root_grate | root_other | trunk_wire | trnk_light | trnk_other | brch_light | brch_shoe | brch_other | address | postcode | zip_city | community board | borocode | borough | cncldist | st_assem | st_senate | nta | nta_name | boro_ct | state | latitude | longitude | x_sp | y_sp | council district | census tract | bin | bbl
0 | 180683 | 348711 | 08/27/2015 | 3 | 0 | OnCurb | Alive | Fair | Acer rubrum | red maple | None | None | NoDamage | TreesCount Staff | None | No | No | No | No | No | No | No | No | No | 108-005 70 AVENUE | 11375 | Forest Hills | 406 | 4 | Queens | 29 | 28 | 16 | QN17 | Forest Hills | 4073900 | New York | 40.723092 | -73.844215 | 1.027431e+06 | 202756.7687 | 29.0 | 739.0 | 4052307.0 | 4.022210e+09
1 | 200540 | 315986 | 09/03/2015 | 21 | 0 | OnCurb | Alive | Fair | Quercus palustris | pin oak | None | None | Damage | TreesCount Staff | Stones | Yes | No | No | No | No | No | No | No | No | 147-074 7 AVENUE | 11357 | Whitestone | 407 | 4 | Queens | 19 | 27 | 11 | QN49 | Whitestone | 4097300 | New York | 40.794111 | -73.818679 | 1.034456e+06 | 228644.8374 | 19.0 | 973.0 | 4101931.0 | 4.044750e+09
2 | 204026 | 218365 | 09/05/2015 | 3 | 0 | OnCurb | Alive | Good | Gleditsia triacanthos var. inermis | honeylocust | 1or2 | None | Damage | Volunteer | None | No | No | No | No | No | No | No | No | No | 390 MORGAN AVENUE | 11211 | Brooklyn | 301 | 3 | Brooklyn | 34 | 50 | 18 | BK90 | East Williamsburg | 3044900 | New York | 40.717581 | -73.936608 | 1.001823e+06 | 200716.8913 | 34.0 | 449.0 | 3338310.0 | 3.028870e+09
3 | 204337 | 217969 | 09/05/2015 | 10 | 0 | OnCurb | Alive | Good | Gleditsia triacanthos var. inermis | honeylocust | None | None | Damage | Volunteer | Stones | Yes | No | No | No | No | No | No | No | No | 1027 GRAND STREET | 11211 | Brooklyn | 301 | 3 | Brooklyn | 34 | 53 | 18 | BK90 | East Williamsburg | 3044900 | New York | 40.713537 | -73.934456 | 1.002420e+06 | 199244.2531 | 34.0 | 449.0 | 3338342.0 | 3.029250e+09
4 | 189565 | 223043 | 08/30/2015 | 21 | 0 | OnCurb | Alive | Good | Tilia americana | American linden | None | None | Damage | Volunteer | Stones | Yes | No | No | No | No | No | No | No | No | 603 6 STREET | 11215 | Brooklyn | 306 | 3 | Brooklyn | 39 | 44 | 21 | BK37 | Park Slope-Gowanus | 3016500 | New York | 40.666778 | -73.975979 | 9.909138e+05 | 182202.4260 | 39.0 | 165.0 | 3025654.0 | 3.010850e+09
5 | 190422 | 106099 | 08/30/2015 | 11 | 0 | OnCurb | Alive | Good | Gleditsia triacanthos var. inermis | honeylocust | 1or2 | Helpful | NoDamage | Volunteer | None | No | No | No | No | No | No | No | No | No | 8 COLUMBUS AVENUE | 10023 | New York | 107 | 1 | Manhattan | 3 | 67 | 27 | MN14 | Lincoln Square | 1014500 | New York | 40.770046 | -73.984950 | 9.884187e+05 | 219825.5227 | 3.0 | 145.0 | 1076229.0 | 1.011310e+09
6 | 190426 | 106099 | 08/30/2015 | 11 | 0 | OnCurb | Alive | Good | Gleditsia triacanthos var. inermis | honeylocust | 1or2 | Helpful | NoDamage | Volunteer | None | No | No | No | No | No | No | No | No | No | 120 WEST 60 STREET | 10023 | New York | 107 | 1 | Manhattan | 3 | 67 | 27 | MN14 | Lincoln Square | 1014500 | New York | 40.770210 | -73.985338 | 9.883112e+05 | 219885.2785 | 3.0 | 145.0 | 1076229.0 | 1.011310e+09
7 | 208649 | 103940 | 09/07/2015 | 9 | 0 | OnCurb | Alive | Good | Tilia americana | American linden | None | None | NoDamage | Volunteer | MetalGrates | No | Yes | No | No | No | No | No | No | No | 311 WEST 50 STREET | 10019 | New York | 104 | 1 | Manhattan | 3 | 75 | 27 | MN15 | Clinton | 1012700 | New York | 40.762724 | -73.987297 | 9.877691e+05 | 217157.8561 | 3.0 | 133.0 | 1086093.0 | 1.010410e+09
8 | 209610 | 407443 | 09/08/2015 | 6 | 0 | OnCurb | Alive | Good | Gleditsia triacanthos var. inermis | honeylocust | None | None | NoDamage | TreesCount Staff | None | No | No | No | No | No | No | No | No | No | 65 JEROME AVENUE | 10305 | Staten Island | 502 | 5 | Staten Island | 50 | 64 | 23 | SI14 | Grasmere-Arrochar-Ft. Wadsworth | 5006400 | New York | 40.596579 | -74.076255 | 9.630732e+05 | 156635.5542 | NaN | NaN | NaN | NaN
9 | 192755 | 207508 | 08/31/2015 | 21 | 0 | OffsetFromCurb | Alive | Fair | Platanus x acerifolia | London planetree | None | None | NoDamage | TreesCount Staff | None | No | No | No | No | No | No | No | No | No | 638 AVENUE Z | 11223 | Brooklyn | 313 | 3 | Brooklyn | 47 | 45 | 23 | BK26 | Gravesend | 3037402 | New York | 40.586357 | -73.969744 | 9.926537e+05 | 152903.6306 | 47.0 | 37402.0 | 3320727.0 | 3.072350e+09

Last rows

(index) | tree_id | block_id | created_at | tree_dbh | stump_diam | curb_loc | status | health | spc_latin | spc_common | steward | guards | sidewalk | user_type | problems | root_stone | root_grate | root_other | trunk_wire | trnk_light | trnk_other | brch_light | brch_shoe | brch_other | address | postcode | zip_city | community board | borocode | borough | cncldist | st_assem | st_senate | nta | nta_name | boro_ct | state | latitude | longitude | x_sp | y_sp | council district | census tract | bin | bbl
683778 | 200671 | 229928 | 09/03/2015 | 29 | 0 | OnCurb | Alive | Good | Platanus x acerifolia | London planetree | None | None | Damage | TreesCount Staff | Stones,BranchLights | Yes | No | No | No | No | No | Yes | No | No | 1040 EAST 16 STREET | 11230 | Brooklyn | 314 | 3 | Brooklyn | 44 | 48 | 17 | BK43 | Midwood | 3053200 | New York | 40.624296 | -73.960344 | 9.952583e+05 | 166726.7165 | 44.0 | 532.0 | 3180008.0 | 3.067170e+09
683779 | 193070 | 210437 | 08/31/2015 | 27 | 0 | OnCurb | Alive | Good | Platanus x acerifolia | London planetree | None | None | NoDamage | TreesCount Staff | Stones,WiresRope,BranchLights | Yes | No | No | Yes | No | No | Yes | No | No | 2720 QUENTIN ROAD | 11229 | Brooklyn | 315 | 3 | Brooklyn | 48 | 41 | 17 | BK44 | Madison | 3054800 | New York | 40.609541 | -73.945835 | 9.992893e+05 | 161353.2295 | 48.0 | 548.0 | 3183517.0 | 3.068100e+09
683780 | 195173 | 304371 | 09/01/2015 | 15 | 0 | OnCurb | Alive | Fair | Platanus x acerifolia | London planetree | None | Helpful | NoDamage | TreesCount Staff | None | No | No | No | No | No | No | No | No | No | 50-017 SKILLMAN AVENUE | 11377 | Woodside | 402 | 4 | Queens | 26 | 30 | 12 | QN31 | Hunters Point-Sunnyside-West Maspeth | 4016900 | New York | 40.746122 | -73.913657 | 1.008175e+06 | 211120.9420 | 26.0 | 169.0 | 4000947.0 | 4.001290e+09
683781 | 155348 | 230784 | 08/18/2015 | 20 | 0 | OnCurb | Alive | Good | Quercus palustris | pin oak | None | None | NoDamage | TreesCount Staff | Stones | Yes | No | No | No | No | No | No | No | No | 1040 FLATBUSH AVENUE | 11226 | Brooklyn | 314 | 3 | Brooklyn | 40 | 42 | 21 | BK42 | Flatbush | 3051001 | New York | 40.645694 | -73.958179 | 9.958556e+05 | 174522.7192 | 40.0 | 51001.0 | 3328218.0 | 3.051250e+09
683782 | 184210 | 504308 | 08/29/2015 | 3 | 0 | OnCurb | Alive | Good | Quercus palustris | pin oak | 1or2 | None | NoDamage | Volunteer | None | No | No | No | No | No | No | No | No | No | 2185 VALENTINE AVENUE | 10457 | Bronx | 205 | 2 | Bronx | 15 | 86 | 33 | BX41 | Mount Hope | 2038100 | New York | 40.854570 | -73.899192 | 1.012137e+06 | 250636.4824 | 15.0 | 381.0 | 2013548.0 | 2.031490e+09
683783 | 155433 | 217978 | 08/18/2015 | 25 | 0 | OnCurb | Alive | Good | Quercus palustris | pin oak | None | None | Damage | Volunteer | None | No | No | No | No | No | No | No | No | No | 32 MARCY AVENUE | 11211 | Brooklyn | 301 | 3 | Brooklyn | 34 | 53 | 18 | BK73 | North Side-South Side | 3051900 | New York | 40.713211 | -73.954944 | 9.967407e+05 | 199121.6363 | 34.0 | 519.0 | 3062513.0 | 3.023690e+09
683784 | 183795 | 348185 | 08/29/2015 | 7 | 0 | OnCurb | Alive | Good | Cladrastis kentukea | Kentucky yellowwood | 1or2 | None | NoDamage | Volunteer | None | No | No | No | No | No | No | No | No | No | 67-035 SELFRIDGE STREET | 11375 | Forest Hills | 406 | 4 | Queens | 29 | 28 | 15 | QN17 | Forest Hills | 4070700 | New York | 40.715194 | -73.856650 | 1.023989e+06 | 199873.6475 | 29.0 | 707.0 | 4075448.0 | 4.031810e+09
683785 | 166161 | 401670 | 08/22/2015 | 12 | 0 | OnCurb | Alive | Good | Acer rubrum | red maple | None | None | Damage | Volunteer | None | No | No | No | No | No | No | No | No | No | 130 BIDWELL AVENUE | 10314 | Staten Island | 501 | 5 | Staten Island | 50 | 63 | 24 | SI07 | Westerleigh | 5020100 | New York | 40.620762 | -74.136517 | 9.463514e+05 | 165466.0763 | 50.0 | 201.0 | 5011657.0 | 5.004080e+09
683786 | 184028 | 504204 | 08/29/2015 | 9 | 0 | OnCurb | Alive | Good | Acer rubrum | red maple | None | None | NoDamage | TreesCount Staff | None | No | No | No | No | No | No | No | No | No | 1985 ANTHONY AVENUE | 10457 | Bronx | 205 | 2 | Bronx | 15 | 86 | 33 | BX41 | Mount Hope | 2023502 | New York | 40.850828 | -73.903115 | 1.011054e+06 | 249271.9507 | 15.0 | 23502.0 | 2007757.0 | 2.028120e+09
683787 | 200607 | 306527 | 09/03/2015 | 23 | 0 | OnCurb | Alive | Fair | Acer rubrum | red maple | None | None | NoDamage | TreesCount Staff | None | No | No | No | No | No | No | No | No | No | 69-069 183 STREET | 11365 | Fresh Meadows | 408 | 4 | Queens | 24 | 25 | 11 | QN41 | Fresh Meadows-Utopia | 4134100 | New York | 40.732165 | -73.787526 | 1.043136e+06 | 206095.5383 | 24.0 | 1341.0 | 4153657.0 | 4.071360e+09